DON BOSCO INSTITUTE OF TECHNOLOGY BENGALURU
DEPARTMENT OF CSE(AI&ML)
ASSIGNMENT 2
Course name: Human Centered AI Course code: BAI613A
Semester: VI A & B    Date Given: 21-04-2025    Date of Submission: 24-04-2025    Max marks: 10
Module 1
1. Explain with a neat diagram the Two Grand Goals of AI Research. [CO1, RBT: L2]
Soln:
• Origins of AI Science and Engineering Research
o AI research goals were proposed at least 60 years ago.
o Early conferences gathered those interested in Alan Turing’s question: “Can
Machines Think?”
• Definition of AI Science Research
o Focuses on making computers perform human-like tasks.
o Aims to match or exceed human perceptual, cognitive, and motor abilities.
• Turing Test and Variants
o A machine satisfies the Turing Test if observers cannot distinguish it from a
human in a text-based conversation.
o Variants include:
▪ Generating images indistinguishable from real photos.
▪ Creating humanoid robots that speak, move, and resemble humans.
• Stuart Russell’s Perspective on AI
o AI serves two purposes:
▪ Understanding human intelligence.
▪ Improving civilization.
o Acknowledges challenges in imbuing machines with intelligence.
• Research Areas in AI
o Perceptual, cognitive, and motor abilities research includes:
▪ Pattern recognition (images, speech, facial recognition, signals, etc.).
▪ Natural language processing and translation.
▪ Predictive analytics and emotional recognition in applications.
o AI in gaming:
▪ Playing checkers, chess, Go, and poker at human or superhuman
levels.
• Evolution of AI Science Approaches
o Early AI research focused on symbolic manipulation.
o Shifted to statistical approaches with machine learning and deep learning.
o Advanced neural network strategies include:
▪ Generative Adversarial Networks (GANs).
▪ Convolutional Neural Networks (CNNs).
▪ Recurrent Neural Networks (RNNs).
▪ Inverse Reinforcement Learning (IRL).
▪ Foundation models and their variants.
• Achievements and Criticism of AI
o AI is seen as a historical turning point with both successes and failures.
o Successes:
▪ Optical character recognition.
▪ Speech recognition.
▪ Natural language translation.
o Criticism:
▪ AI innovations are imperfect.
▪ Many AI projects have failed, as is common in ambitious research.
• Traditional Engineering vs. AI Methods
o Some AI successes are attributed to traditional engineering rather than AI.
o Examples:
▪ IBM’s Deep Blue defeated Garry Kasparov (1997), but used brute-
force computing, not AI.
▪ AI-based expert systems in business failed, while human-curated
rule-based systems succeeded.
• Challenges in AI Applications
o Issues with deep learning:
▪ Works well in lab experiments but fails in real-world scenarios.
▪ Criticisms by NYU professors Gary Marcus and Ernest Davis:
▪ AI systems misinterpret photos.
▪ Chatbots display bias.
▪ AI makes poor healthcare recommendations.
▪ Self-driving cars crash into obstacles.
o Despite failures, Marcus and Davis remain optimistic about AI’s future.
o They call for improvements in common-sense reasoning and better AI
development approaches.
• Mitchell Waldrop’s View on AI
o Deep learning is powerful but has limitations.
o Failures highlight AI’s gap from human intelligence.
o Possible solutions:
▪ Improve deep learning strategies.
▪ Expand training datasets.
▪ Take on challenges positively.
• Future of AI and Human-Centered AI (HCAI) Design
o AI is still in its early days after 60 years of research.
o Advocates for Human-Centered AI (HCAI) Design:
▪ Involving stakeholders in design and iterative testing.
▪ Enhancing transparency and human control over AI algorithms.
▪ Implementing explainable user interfaces and audit trails.
▪ Establishing independent oversight for AI decision-making.
• Influence of AI Research Debates
o AI discussions shape:
▪ Government funding.
▪ Commercial projects.
▪ Academic research and teaching.
▪ Public perceptions.
• Two Grand Goals of AI Research
o Science: Understanding intelligence.
o Innovation: Creating beneficial technologies.
• Four Pairs of AI Design Possibilities
o Two main goals lead to four pairs of design metaphors describing AI and
HCAI research.
o All metaphors are valuable but serve different purposes:
o High computer autonomy allows for unattended activity.
o High human control enables human intervention.
o First pair of metaphors:
▪ Intelligent agents – Represent competent, independent action.
▪ Supertools – Indicate human control in machine usage.
o Second pair of metaphors:
▪ Teammates – Suggest human-like actions.
▪ Tele-bots (tele-operated devices) – Indicate direct human operation.
o Third pair of metaphors:
▪ Assured autonomy – Ensures safety due to its design.
▪ Control centers – Ensure safety through human monitoring and intervention.
o Fourth pair of metaphors:
▪ Social robots – Designed to behave like a human.
▪ Active appliances – Function like household appliances such as dishwashers or clothes dryers.
o Designs should combine the strengths of automation and human control.
o Aim to create reliable, safe, and trustworthy AI systems.
o Key areas of impact:
▪ Business, education, healthcare, environment, and community
safety.
2. Explain in detail the Science Goal and the Innovation Goal. [CO1, RBT: L1]
Soln:
AI Goals by Researchers & Developers
• Stuart Russell & Peter Norvig define AI goals as:
o Think like a human
o Act like a human
o Think rationally
o Act rationally
• David Poole & Alan Mackworth define AI as the study of "computational agents that
act intelligently."
o Their goal: Understand principles enabling intelligent behavior in both
natural and artificial systems.
Different Perspectives on AI
• Some view AI as tools to augment human abilities and creativity.
• The focus can be divided into two major goals:
o Science Goal: Understanding human intelligence and replicating it.
o Innovation Goal: Developing practical, widely used AI applications.
Science Goal
• Core objective: Understand human perceptual, cognitive, and motor abilities to
create AI systems that match or surpass human performance.
• Covers:
o Social robots, common-sense reasoning, affective computing, machine
consciousness, Artificial General Intelligence (AGI).
• Long-term aspirations: Some researchers believe it may take 100–1000 years to fully
replicate human intelligence.
• Philosophy: Humans are viewed as sophisticated machines that can be emulated.
• AI 100 Report Perspective: Human and computer intelligence differ in scale, speed,
autonomy, and generality but are fundamentally the same.
• Historical Motivation: Humans have long aspired to create human-like machines.
• Elizabeth Broadbent’s Perspective: AI research reflects the deep human urge to
create something in their own image.
• Terminology used in Science Goal AI:
o Computers as "smart machines," "intelligent agents," or "knowledgeable
actors."
o AI "learns" and "trains" like a human.
• Human vs AI Comparisons:
o Examples include AI oncologists vs human oncologists in diagnosing cancer.
o Media often exaggerates AI progress, e.g., headlines like "Robots Can Now
Read Better than Humans."
• Beliefs in AI Autonomy:
o AI systems can be teammates, partners, or independent entities.
o Some researchers support AI systems setting their own goals.
o Autonomy involves developing new goals beyond automation.
• Human-like Social Robots:
o "Embodied intelligence" in AI leads to anthropomorphic robots.
o Bio-inspired robots are a popular research focus.
• Legal & Ethical Considerations:
o Some scholars envision AI gaining legal rights.
o Debate on whether AI can be moral/ethical actors.
o However, the priority here is on design issues rather than legal questions.
Innovation Goal
• Core objective: Develop AI products & services that are practical and widely
adopted.
• Often referred to as the "engineering goal."
• Key focus: Human-Centered AI (HCAI) to ensure usability and reliability.
• Metaphors in Innovation AI:
o AI as "supertools" rather than autonomous agents.
o AI products include tele-bots, active appliances, and control centers.
• Example: Airport Assistance
o Science Goal Approach: A humanoid robot that greets and assists travelers.
o Innovation Goal Approach: A smartphone app providing real-time maps,
security line wait times, and flight updates.
• User-Centric Design in Innovation AI:
o Researches user behavior and social dynamics.
o Focuses on real-world usability and adoption.
o Works with professionals on practical, impactful AI solutions.
• Balancing Human Control & Automation:
o Some tasks require full automation (e.g., airbag deployment, pacemakers).
o Others require full human control (e.g., bicycle riding, piano playing).
o Most AI systems fall in between, requiring a balance of automation &
control.
• Innovation Goal & AI Safety:
o Designers guard against both excessive automation and excessive human control.
o Implement interlocks to prevent mistakes and failures.
o Aim for trustworthy and safe AI systems.
• Human-Computer Interaction (HCI) & Innovation AI:
o Key focus areas: Design thinking, user experience, market research,
continuous monitoring.
o User preferences: AI systems should be comprehensible, predictable,
controllable.
o Ensuring accountability: AI should support audit trails and product logs.
o Critical applications: AI in pacemakers, self-driving cars, planes must be
reliable & transparent.
• Examples of Innovation-Based AI Success:
o Speech recognition research led to Siri, Alexa, Google Home, Cortana.
o Natural language translation research improved services like Google
Translate.
o Image understanding research enabled automatic alt-text for accessibility.
• Humanoid vs. Practical AI Designs:
o Science goal focuses on bipedal humanoid robots with facial expressions.
o Innovation goal favors wheeled rovers, tele-operated drones for efficiency.
o Example: Surgical robots are actually tele-bots, controlled by human
surgeons.
• Customization in AI Innovation:
o Science goal aims for general-purpose social robots.
o Innovation goal customizes AI for specific tasks.
o Specialized AI solutions for:
▪ Nimble hand movements
▪ Heavy lifting
▪ Navigation in confined spaces
• Lewis Mumford’s View on AI Design:
o Criticism of Animism: Early AI mimicked humans/animals but was inefficient.
o Example: Wheels > Human legs for transportation.
o Better AI designs focus on functionality rather than human imitation.
• Innovation AI & Social Connectivity:
o AI enhances human collaboration (e.g., Google Docs, shared databases).
o Examples:
▪ Zoom, Webex, Microsoft Teams grew during the COVID-19 crisis.
▪ MOOCs (Khan Academy, Coursera) expanded online learning.
• Work From Home (WFH) & AI:
o AI-enabled remote work during COVID.
o Tools like Zoom supported virtual gatherings (e.g., weddings, funerals).
o Emerging platforms: Kumospace, [Link] for interactive online
meetings.
• AI & Social Media:
o AI powers social platforms like Facebook, Twitter, Weibo.
o Benefits: Community building, business growth, teamwork.
o Concerns: Privacy invasion, fake news, hate speech, surveillance capitalism.
o Potential solutions: AI moderation + human oversight.
• Human-Augmenting AI Tools:
o Orthotics (e.g., eyeglasses) enhance abilities.
o Prosthetics replace missing limbs.
o Exoskeletons amplify human strength.
• Future AI & Innovation:
o The following topics explore:
▪ Four key pairs of metaphors guiding AI research & development.
▪ How different AI models work for various contexts.
▪ The best way to combine AI innovation with HCAI for human
benefit.
3. Explain in detail Intelligent Agents and Supertools. [CO1, RBT: L2]
Soln:
Early Views on AI and Computers
• By the 1940s, modern electronic digital computers were referred to as “awesome
thinking machines” and “electronic brains.”
• Professor Dianne Martin reviewed survey data and noted concerns:
o The "awesome thinking machine" myth delayed public acceptance of
computers in the workplace.
o It also led to unrealistic expectations for easy solutions to complex social
problems.
Alan Turing and the Turing Test
• In 1950, Alan Turing published “Computing Machinery and Intelligence” asking: “Can
Machines Think?”
• Proposed the Turing Test (imitation game) to determine if machines could mimic
human intelligence.
• AI researchers took up the challenge by creating machines for chess, image
recognition, and customer support.
• Critics saw the Turing Test as a publicity stunt with poorly defined rules.
• Loebner Prize (since 1990) has rewarded programs that best simulate human
conversation.
• AI Magazine (Jan 2016 issue) discussed new forms of Turing Tests.
• Simone Natale (University of Turin) criticized the Turing Test as a banal deception,
exploiting people's willingness to accept machine-like sociality.
Man-Computer Symbiosis & AI Terminology
• In 1960, J.C.R. Licklider described “man–computer symbiosis” where:
o Computers handle routine tasks.
o Humans provide insights and decision-making.
• AI-related terms like “smart, intelligent, knowledgeable, and thinking” spread
concepts like machine learning and deep learning.
• Neuroscience metaphors (neural networks) reinforced the idea of computers
mimicking human brains.
• IBM's Watson system was initially branded as cognitive computing, but by 2020,
IBM shifted to “augmented intelligence” for clarity.
• Google’s PAIR initiative (People and AI Research) emphasizes a human-centered
approach.
Media Influence & Popular Culture
• Journalists have fueled AI hype, often portraying computers as thinking entities that
will take human jobs.
• Magazine cover stories:
o Newsweek (1980) – “Machines That Think”
o Time (1996) – “Can Machines Think?”
• Graphic artists and Hollywood reinforced AI imagery:
o Common visuals: Robot hands reaching out to humans, robotic versions of
The Thinker sculpture.
o Famous AI-based films:
▪ Sentient computers: HAL 9000 (2001: A Space Odyssey), C-3PO (Star
Wars)
▪ Terrifying AI: The Terminator, The Matrix
▪ Charming AI: Wall-E, Robot & Frank
▪ Thought-provoking AI: Her, Ex Machina
• Brian Cantwell Smith (University of Toronto) cautioned against using words like:
o Know, read, explain, understand when referring to computers.
• Despite criticism, media headlines continue to depict AI as an independent agent:
o “Machines Learn Chemistry” ([Link])
o “Hubble Accidentally Discovers a New Galaxy” (NASA)
o “AI Finds Disease-Related Genes” ([Link])
The Supertool Perspective & Human-Centered AI
• Some researchers advocate AI as a supertool that amplifies and empowers humans.
• MIT Professor Daniela Rus: “AI is nothing more than a tool with huge power to
empower us.”
• Douglas Engelbart's 1968 demonstration showed how computing tools can augment human intellect.
• John Markoff’s book (Machines of Loving Grace) explores AI vs. Intelligence
Augmentation (IA).
• A 1997 debate between Pattie Maes (MIT) and Ben Shneiderman showcased two AI design perspectives:
o Direct Manipulation: Users should control automation via buttons, sliders,
and checkboxes.
o Software Agents: AI should proactively anticipate user needs and act on its
own.
o The debate continues today.
Design Guidelines & User Control
• Innovation goal developers prefer tool-like AI products emphasizing user control.
• Guidelines supporting user control:
o Apple Human Interface Guidelines:
▪ “User Control: people, not apps, are in control.”
▪ “Flexibility: give users complete, fine-grained control over their
work.”
o IBM, Microsoft, and US government guidelines have improved UI
consistency.
• Real-world example:
o A blind woman on a plane used a laptop with accessibility features to work
on a business report.
o She was head of accessibility for the state of Utah and confirmed that
accessibility guidelines empowered her professionally.
• Despite this, AI hype persists at AI conferences focusing on full automation (e.g.,
mammogram reading, self-driving cars).
• HCAI-focused conferences (e.g., SIGCHI, Augmented Humans, World Usability Day)
promote human-AI collaboration.
Combining AI & Human-Centered Design
• Most application developers (3M+ apps in Apple & Google Play stores) favor tool-
like AI interfaces with user control.
• Three ways to merge AI with human control:
1. AI-guided interfaces: Users receive AI-powered recommendations but stay
in control.
▪ Examples: GPS navigation, web search, e-commerce
recommendations.
2. AI-enhanced tools with human control:
▪ Example: Smartphone cameras use AI (HDR, focus, jitter removal)
but allow users to compose shots and apply filters.
3. Supertools with AI-based recommendations:
▪ Example: AI-powered coaching tools for music practice, yoga
postures, and diet recommendations.
▪ Users decide whether to follow recommendations.
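A rough, hypothetical Python sketch of the first pattern above (AI-guided recommendations with the person staying in control) is given below; the function names and route suggestions are invented for illustration and do not come from any real navigation product:

def recommend_routes(origin: str, destination: str) -> list[str]:
    """Stand-in for an AI component that ranks candidate routes."""
    return [f"{origin} -> highway -> {destination}",
            f"{origin} -> scenic road -> {destination}"]

def choose_route(origin: str, destination: str) -> str:
    """Recommendations are displayed, but the user makes the final choice."""
    options = recommend_routes(origin, destination)
    for i, option in enumerate(options, start=1):
        print(f"{i}. {option}")
    picked = input("Pick a route number, or type your own route: ")
    if picked.isdigit() and 1 <= int(picked) <= len(options):
        return options[int(picked) - 1]   # accept the AI suggestion
    return picked                          # or override it entirely

print("Chosen:", choose_route("Home", "Airport"))

The point of the sketch is that the automation supplies ranked options while control, and responsibility for the outcome, stays with the user.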
Final Thoughts
• Blending intelligent agents with human-controlled supertools creates consistent,
comprehensible, predictable, and controllable AI solutions.
• AI + HCAI thinking can enhance product value and user adoption.
4. Explain in detail the Key Differences Between Humans and Computers. [CO1, RBT: L2]
Soln:
Key Differences Between Humans and Computers
Responsibility
• Computers are neither legally nor morally responsible entities.
• They are not liable or accountable, and this distinction is upheld by all legal systems.
• Boden emphasizes: “Humans, not robots, are responsible agents.”
• This principle is critical in fields like the military, where responsibility and chain of
command are essential.
• Fighter jet pilots, despite using automation, consider themselves in control and
accountable for their missions.
• Early Mercury astronauts rejected capsule designs without windows, insisting on
manual control for safety.
• Neil Armstrong manually landed the Lunar Module despite automation; it was not his
"partner."
• The Mars Rovers are not teammates but sophisticated automated systems
integrated with human tele-operation.
• The US Air Force replaced the term Unmanned Aerial Vehicles (UAVs) with
Remotely Piloted Vehicles (RPVs) to emphasize human responsibility.
• Drone pilots operating from U.S. bases bear responsibility for missions and suffer
war-related psychological trauma.
• The Canadian Government enforces strict knowledge requirements for remote
aircraft system licenses (RPAS).
• Designers and businesses recognize their legal and moral accountability for
commercial products and services.
• Commercial activities are further shaped by regulations, industry standards, and
insurance requirements.
Distinctive Capabilities
• Computers excel at:
o Sophisticated algorithms
o Huge databases
o Superhuman sensors
o Information-rich displays
o Powerful effectors
• The teammate metaphor encourages imitating human abilities instead of optimizing
these distinctive capabilities.
• A robot rescue design team attempted to translate video images into text messages
for operators, reducing efficiency instead of enhancing it.
• Better design choices should maximize computers' unique abilities rather than
forcing human-like features.
• Designers can create technologies that empower people to be significantly more
effective.
• Historical supertools (e.g., microscopes, telescopes, bulldozers, ships, planes) have
empowered users without being "teammates."
• Digital technologies such as cameras, Google Maps, and web search also empower
users without a human-like approach.
• Devices like copy machines, dishwashers, pacemakers, HVAC systems are tools, not
teammates, yet they enhance human capabilities.
Human Creativity
• Humans are the driving force behind innovation, discovery, art, and music.
• Scientific papers are authored by humans, even when computers assist in research.
• Art and music are credited to human creators, even when technology aids
composition.
• Essential human qualities—passion, empathy, humility, and intuition—are not easily
replicated by computers.
• Customization is another key aspect of human creativity.
• Users benefit from systems that allow them to fix, personalize, and provide
feedback to improve technology.
• Continuous improvement of supertools and tele-bots depends on human input and
creativity.
Limitations of the Teammate Metaphor
• Human-like robot designs often succeed only in entertainment, crash testing, and
medical training.
• This trend is unlikely to change.
• More effective designs exist for rescue robots, bomb disposal units, and pipe
inspectors.
• Many of these robots use four-wheeled or treaded vehicles, typically tele-operated
by humans.
Example: The DaVinci Surgical Robot
• DaVinci surgical robot is not a teammate but a tele-bot designed to assist surgeons.
• Enables precise surgery in small, difficult-to-reach areas.
• Lewis Mumford emphasizes that successful technologies diverge from human
forms.
• Intuitive Surgical, the creator of DaVinci systems, clearly states:
o “Robots don’t perform surgery. Your surgeon performs surgery using Da
Vinci, guiding instruments via a console.”
Fig 14.1 DaVinci Surgical System from Intuitive Surgical Source: [Link]
Robotic Devices, Tele-operation, and the Teammate Metaphor
• High Degree of Tele-operation in Robotics
o Many robotic devices are highly tele-operated, meaning an operator controls
their activities despite automation.
o Examples:
▪ Drones: Considered tele-bots; they can hover, orbit, return to their
take-off point, or follow GPS waypoints.
▪ NASA Mars Rover Vehicles:
▪ Feature a mix of tele-operated control and independent
movement.
▪ Sensors help detect obstacles and precipices, enabling
avoidance strategies.
▪ NASA’s Jet Propulsion Laboratory has dozens of operators
controlling various Rover systems, despite their distance of
hundreds of millions of miles.
▪ This integration of human control and automation is highly
effective.
• Tele-bots and Telepresence: Alternative Design Possibilities
o These designs enhance remote operation and precise control of devices.
o Example:
▪ Tele-pathologists: Use remote microscopes to examine tissue
samples.
o Combined designs incorporate elements of the teammate model but focus
on enhancing human capabilities via tele-operation.
• Computers as Teammates in Information Processing
o Computers act as teammates by providing access to vast databases and
superhuman sensors.
o Example:
▪ Medical Imaging:
▪ Three-dimensional echocardiograms use false color to
indicate blood flow volume.
▪ Enhances clinician confidence in making cardiac treatment
decisions.
▪ Bloomberg Terminal for Financial Data:
▪ Enables users to make bolder investment decisions.
▪ Features a specialized keyboard and multiple large displays.
▪ Windows are arranged in a spatially stable manner for ease
of use.
▪ Tiled windows prevent the need for excessive scrolling or
rearrangement.
▪ Clicking in one window updates related information in
others.
▪ Over 300,000 users pay $20,000 per year for access, valuing
its supertool functionality.
5. Discuss in detail the Importance of Assured Autonomy and Control Centers. [CO1, RBT: L1]
Soln:
Overview and Definition of Computer Autonomy
• Computer autonomy is a popular science goal among AI researchers, developers,
journalists, and promoters.
• Past descriptions of computer autonomy are shifting towards discussions about
assured autonomy, contrasting with the rising use of control centers.
• Computer autonomy typically refers to machines functioning independently without
direct human control.
• According to the US Defense Science Board:
o Autonomy comes from delegating decisions to authorized entities within
defined boundaries.
o Systems that follow prescriptive rules without deviation are automated, not
autonomous.
o Full autonomy involves the system independently selecting and composing
courses of action based on its understanding of the world, itself, and its
situation.
Concerns and Misconceptions
• The Defense Science Board warns that:
o The term “autonomy” can falsely suggest computers act without control,
especially in media and military perceptions.
o All autonomous systems still require human supervision at some level.
o The software sets boundaries on what autonomous systems can decide or
do.
o Autonomy itself does not inherently solve problems.
• This caution underscores the need for interdependence between humans and
machines within organizational and social systems.
• Since humans remain legally, morally, and ethically responsible, computers should be
designed to ensure user control.
• A combined design philosophy suggests autonomy is acceptable if it is:
o Comprehensible
o Predictable
o Controllable
o With user control over unreliable or high-importance features.
Real-World Failures of Autonomous Systems
• Despite enthusiasm, the practical application of full autonomy has led to issues:
o Autonomous financial trading systems have caused billion-dollar crashes.
o Deadly incidents include:
▪ Patriot missile system shooting down friendly aircraft during the Iraq
War.
▪ Tesla's 2016 crash while on autopilot.
▪ Boeing 737 MAX crashes in 2018–2019 caused by the MCAS system,
which acted without informing pilots.
• Robin Murphy’s law of autonomous robots states:
o Deployments often fall short of autonomy targets and worsen coordination
with human stakeholders.
Historical and Operational Challenges
• Historical commentaries warned that:
o Autonomy can increase workload due to the need for constant monitoring.
o Operators are responsible for outcomes yet uncertain about system
behavior.
• Other concerns include:
o Reduced human vigilance when there is little to do.
o Challenges in quickly taking over control when issues arise.
o Difficulty in maintaining operator skills during inactivity.
• These ironies—vigilance, rapid transitions, and deskilling—are still relevant today.
Critiques and Myths About Full Autonomy
• Bradshaw, Hoffman, Woods, and Johnson’s “Seven Deadly Myths of Autonomous
Systems” argue:
o “Smart machines” that cannot explain or justify their actions are dangerous.
o Machines that cannot respond to human intervention during failure are even
more problematic.
o Belief in full autonomy propagates harmful myths that lead to serious
misconceptions and consequences.
Autonomy vs. Human Awareness
• Mica Endsley, though supportive of autonomy, notes:
o More autonomy can reduce human situational awareness.
o Reduced awareness makes manual takeovers less effective.
o Her work suggests a supervisory control model as a realistic solution.
• Peter Hancock calls for elimination or severe restriction of autonomous devices,
fearing their inevitability and risks.
Lethal Autonomous Weapons (LAWS)
• A major ongoing debate surrounds LAWS, which can select and strike targets without
human input.
• Nearly 5,000 people support a ban similar to landmine bans.
• UN’s Convention on Certain Conventional Weapons (with reps from 125 countries) is
working on treaties to restrict LAWS.
• Cognitive scientists have documented failures and dangers of autonomous weapons.
• Some military leaders resist limitations, fearing adversarial use of such weapons.
• Progress on regulations has been slow.
Balanced and Safety-First Design
• The author (Ben Shneiderman) supports autonomy for repetitive, dangerous, or difficult tasks, provided it is reliable and safe.
• Personal anecdote: a self-driving car could have prevented a minor crash in a garage.
• A safety-first approach is especially important for life-critical applications.
Assured Autonomy and Supervised Models
• “Assured autonomy” was a key focus at the February 2020 Computing Research
Association workshop.
• The workshop's balanced report promoted:
o Human-centered approaches
o Formal correctness verification
o Extensive testing
o Independent certification
• It also emphasized:
o Designers should understand legal doctrines of liability and accountability.
• The author supports the report but prefers the term “supervised autonomy”:
o Involves human monitoring via control panels or remote centers.
o Enables timely intervention to ensure safe outcomes.
o Relies on audit trails and product logs for retrospective analysis.
o Used in many domains like cars, trains, ICUs, and networks.
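A minimal, hypothetical sketch of the audit trail idea above: each automated action is appended to a product log so that supervisors can reconstruct what happened later. The record fields and file name are assumptions for illustration, not a standard format:

import json, time

AUDIT_LOG = "decisions.log"   # assumed file name, illustration only

def log_decision(component: str, action: str, inputs: dict, human_override: bool = False) -> None:
    """Append one automated decision to an append-only product log."""
    record = {
        "timestamp": time.time(),         # when the action was taken
        "component": component,           # which automated component acted
        "action": action,                 # what it did
        "inputs": inputs,                 # the sensor or data values it acted on
        "human_override": human_override, # whether a person intervened
    }
    with open(AUDIT_LOG, "a") as f:       # append-only: earlier records are never rewritten
        f.write(json.dumps(record) + "\n")

# Example: an automatic braking component records its intervention for later review.
log_decision("auto_brake", "applied_brakes", {"speed_kmh": 72, "obstacle_m": 8.5})

Logs of this kind support the retrospective analysis that supervised autonomy depends on.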
Trustworthy Autonomy – Potential and Pitfalls
• The term “assured autonomy” is gaining institutional support:
o Johns Hopkins Institute for Assured Autonomy emphasizes:
▪ Trustworthiness based on tech reliability, human engineering, and
public policy.
o The UK has funded “Trustworthy Autonomous Systems” research.
• The author worries these terms may overpromise and mislead developers into
underestimating the need for human supervision.
Alternative Visions and Combined Control
• Terms like supervised, flexible, shared, parallel, and distributed autonomy emphasize
continued human involvement.
• Control centers can:
o Provide oversight
o Maintain situation awareness
o Show clear models of system behavior
o Offer feedback and logs for analysis
• Sheridan’s work on supervisory control clarified the spectrum between full
automation and manual operation, highlighting human accountability.
Real-World Analogies and Models
• The control center metaphor suggests:
o Humans set goals while computers perform predictable, sensor-guided
actions.
o Examples:
▪ Automatic car transmissions
▪ Social media/e-commerce systems with user feedback and alerts
• Mature systems provide:
o Clear mental models
o Interlocks, alerts, and intervention capacity.
Expanded Control Center Roles
• Modern control centers in aviation involve:
o Pilots, co-pilots, TRACON, ARTCC, FAA certification, training reviews, and
flight data analysis.
• Similar multi-level control systems exist in hospitals, transport, finance, and the
military.
Parallel Autonomy and Human Control
• MIT’s Daniela Rus supports parallel autonomy:
o Human drivers remain in control.
o Computers act only to prevent accidents.
• The “safety first” principle should extend to robots and other applications.
Final Thoughts
• While assured autonomy has appeal, control centers may offer better human
oversight.
• For high-speed autonomous actions, constant review and safety checks are essential.
• Ultimately, the terminology matters less than the design strategy:
o Reliability, safety, testing
o User feedback
o Audit trails
o Transparency in reporting issues
6. Discuss in detail, with an example, Social Robots and Active Appliances. [CO1, RBT: L2]
Soln:
Metaphors of Social Robots vs. Active Appliances
• The fourth pair of metaphors contrasts social (humanoid) robots with everyday
appliances.
• Social robots are often referred to as humanoid, anthropomorphic, or android due to
their human-like forms.
• In contrast, common appliances include kitchen stoves, dishwashers, and coffee
makers.
• Additional appliances include clothes washers, dryers, security systems, baby
monitors, and HVAC systems.
• Homeowners also use outdoor active appliances or telebots like garden waterers,
lawn mowers, and pool cleaners.
• These are called active appliances due to their sensors, programmable actions,
mobility, and effectors.
• Active appliances can act on their own based on time or environment triggers (e.g.,
temperature changes, baby crying, or intruder detection).
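As a small, hypothetical illustration of such trigger-based behaviour, the sketch below shows a thermostat-style appliance that acts on a sensor reading while the owner still sets the goal; the thresholds are invented for the example:

def thermostat_step(current_temp_c: float, target_temp_c: float = 21.0,
                    deadband_c: float = 0.5) -> str:
    """Decide an action from one sensor reading; the person sets the target."""
    if current_temp_c < target_temp_c - deadband_c:
        return "heat_on"
    if current_temp_c > target_temp_c + deadband_c:
        return "cool_on"
    return "idle"

# The appliance acts on its own when a trigger condition is met,
# but the human chooses the target temperature (the goal).
for reading in (18.0, 21.2, 23.7):
    print(reading, "->", thermostat_step(reading))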
History of Social Robots
• The concept of animated human-like robots dates back to ancient Greece.
• One notable example is from the 1770s: Swiss watchmaker Pierre Jaquet-Droz’s
mechanical devices.
• These included The Writer (used a quill), The Musician (played piano), and The
Draughtsman (drew pictures).
• These machines became museum pieces in Neuchâtel, Switzerland’s Art and History
Museum.
• Meanwhile, mechanical innovations like music boxes, clocks, and flour mills found
practical success.
Cultural Influence and Mythology
• Classic stories like the Golem (Prague rabbi) and Mary Shelley’s Frankenstein shaped
the idea of man-made beings.
• Children’s stories include Geppetto’s Pinocchio and Tootle the Train, who defied
instructions.
• Goethe’s “Sorcerer’s Apprentice” tells of an autonomous broomstick that flooded a
workshop and couldn’t be stopped.
• These tales illustrate the unpredictable dangers of animated creations.
• Modern language around animated robots traces back to Karel Čapek’s 1920 play
Rossum’s Universal Robots.
Features and Appeal of Social Robots
• Social robots often have human characteristics: limbs, a face, eyes, mouth, and ears.
• They mimic human behaviors: expressions, gestures, speech, emotion, and
personality.
• Their autonomy adds entertainment value beyond traditional puppets.
• Children and adults are fascinated by robots in films, as toys, or DIY kits.
• However, practical use in innovation is limited, except in crash test dummies and
medical mannequins.
Limitations of Human-Like Designs
• Early robot arms mimicked human limbs—16 inches long with five fingers and limited
rotation.
• These designs lifted only up to 20 pounds, restricting their utility.
• Industrial needs led to more capable robot arms with multiple joints and high
strength.
• Modern robot hands may use grippers or suction cups instead of fingers.
• This evolution away from human-like forms was predicted by Lewis Mumford.
Failed Attempts at Social Robot Deployment
• The US Postal Service introduced the Postal Buddy in 1993 intending to deploy
10,000 units.
• The project ended after public rejection of the 183 units that were installed.
• Anthropomorphic bank teller designs (e.g., Tillie the Teller) were also abandoned.
• Banks moved toward task-focused systems like "automatic transaction machines."
• Voice interactions were rejected due to privacy concerns in public settings.
Failures of Digital Avatars
• Clifford Nass’s consulting contributed to Microsoft’s failed 1995 BOB with friendly
onscreen characters.
• The project gained attention but was cancelled within a year.
• Microsoft’s Clippy (Office 1997) was criticized for being intrusive and distracting.
• Ananova, a web-based news avatar from 2000, ended within months.
• China’s Xinhua revived the idea in 2018, improving it in 2020.
Other Avatars and Their Fates
• Apple’s Ken the Butler in the 1987 Knowledge Navigator and intelligent tutoring
avatars also disappeared.
• These avatars distracted users from their main tasks.
• Honda’s humanoid robot Asimo gained media attention but was discontinued in
2018.
• David Hanson’s robot Sophia gained Saudi citizenship but had no commercial
success.
• The company pivoted to Little Sophia, a smaller robot for education and
entertainment.
• Little Sophia aims to teach STEM, coding, and AI in a fun, engaging way for kids.
Social Robot Startups
• MIT’s Cynthia Breazeal promoted emotive robots (e.g., Kismet), leading to the
startup Jibo.
• Jibo closed in 2019.
• Other companies like Anki (makers of Cozmo and Vector) and Mayfield Robotics
(Kuri) also closed in 2019.
• These startups had happy users but failed to build a sustainable market.
Expert Opinions on Social Robots
• Cornell’s Guy Hoffman said that top social robot companies failed to find viable
business models.
• Hoffman believes artists and designers could still create emotionally compelling
models.
• His colleague Malte Jung doubts the practicality of anthropomorphic robot designs.
• Jung sees anthropomorphic forms as overly complex and likely to disappoint.
• He suggests HRI (human–robot interaction) techniques could improve car-like robot
systems.
Shifting Toward Functional Robotics
• Some companies transitioned from humanoid designs to practical robotic
applications.
• Boston Dynamics moved from humanoid robots to wheeled robots with vacuum
suction for warehouse logistics.
• Boston Dynamics was acquired by Hyundai in 2020.
• This acquisition valued the company at over $1 billion, showing market confidence in
practical robot tech.
Fig 16.1 Mobile robot for moving boxes in a warehouse from Boston Dynamics. Source: Handle™ robot image provided
courtesy of Boston Dynamics, Inc.
SoftBank and Human-like Robots
• SoftBank, a Japanese IT and investor giant, was part owner of Boston Dynamics and
is focused on products that are “smart and fun!”
• In 2012, SoftBank acquired French company Aldebaran Robotics.
• In 2014, SoftBank introduced the Pepper robot, which stands four feet tall, with
expressive head, arm, and hand movements, and a three-wheeled base for mobility.
• Pepper is claimed to be “optimized for human interaction” and can engage through
conversation and its touch screen.
• Its design and conversational ability attracted strong interest, resulting in sales of
over 20,000 units.
• Pepper is marketed for roles like welcoming customers, giving product information,
guiding exhibits or stores, and conducting satisfaction surveys.
• A 10-week study in a German elder care home found that older adults enjoyed using
Pepper for physical training and gaming.
• However, the same adults emphasized they did not want robots to replace human
caregivers.
• During the COVID-19 pandemic, Pepper robots wore masks and reminded shopping
mall customers to wear theirs.
• In 2015, SoftBank acquired the 2-foot-high NAO robot from Aldebaran Robotics.
• NAO is more advanced and expensive than Pepper but has sold over 5,000 units for
use in healthcare, retail, tourism, and education.
• Despite interest, SoftBank shut down Pepper production in June 2021, signaling
instability in the social robotics market.
Robot Reception in Japan
• Japan is often seen as highly receptive to gadgets and robots.
• A robot-staffed hotel in Japan closed in 2019 after only a few months of operation.
• The hotel had robot cleaners and front-desk staff, including dinosaur-like
receptionists.
• The company president stated that real-world use revealed robots can be
unnecessary or even annoying in some roles.
• In contrast, traditional vending machines for drinks and snacks remain very popular
in Japan and worldwide.
• These machines have evolved to include heating and cooling features for food and
drink dispensing.
Social Robots in Autism Therapy
• The use of social robots in autism therapy is controversial.
• Some studies show benefits for children with difficulty forming human relationships.
• These studies suggest robots can help children on the autism spectrum feel more
comfortable, potentially improving future human interaction.
• Critics argue that focusing on the technology rather than the child can lead to early
progress but poor long-term outcomes.
• Neil McBride of De Montfort University contends that viewing humans as more than
machines makes it unethical to assign therapeutic responsibilities to robotic toys.
• However, play therapy involving dolls and puppets may evolve to include robotic
characters.
Debates on Human-like Social Robots
• The divide between supporters and skeptics of human-like robots is growing.
• David Watson of the Oxford Internet Institute warns against relying on
anthropomorphic portrayals of robots.
• He considers this tendency misleading at best and dangerous at worst.
• Watson also highlights ethical concerns, especially when algorithms are given
decision-making power in sensitive social areas.
• This could hinder society’s ability to hold individuals and groups accountable for
actions mediated by technology.
• Nevertheless, supporters of human-like social robots remain optimistic about
successful designs and market potential.
Fig 16.2 Dr. Takanori Shibata holding his creation, PARO, a robot therapy device, in June 2018
Voice and Text User Interfaces (VUIs & TUIs)
• Voice assistants (e.g., Alexa, Siri) are widely accepted due to strong speech
recognition and natural responses.
• Devices avoid humanoid forms but are successful in homes.
• Common uses: playing music, info search, controlling smart devices.
• Treated as tools, not companions; users sometimes test their responses.
• Voice dictation helps those with disabilities or busy hands (e.g., doctors).
• Phone-based VUIs work well, but accents/speech issues remain challenges.
• Voice readers aid visually impaired and are popular for multitasking.
• Voice is faster than typing but can interrupt, is temporary, and less informative than
visuals.
• Speaking uses cognitive resources, so voice UIs aren’t ideal in high-pressure tasks
(e.g., flying).
• Talking dolls (e.g., Barbie) haven’t succeeded commercially.
• Text-based chatbots are popular in customer service, designed with polite, friendly
language.
• Some chatbots failed (e.g., Tay), while others succeeded (e.g., Xiaoice, Replika, Woebot).
• Mental health bots like Woebot show promise but need further validation.
• Effective chatbots must provide real value and support meaningful dialog.
The Future of Social Robots
• Social robot adoption is low, but research and development continue.
• Studies show mixed reactions—some prefer tool-like, user-controlled designs.
• Concerns exist about robophobia and the uncanny valley.
• People respond socially to robots, but effectiveness of human-like design is
questioned.
• Functional, non-human machines (e.g., ATMs) have been more successful.
• Designers should prioritize usability, transparency, and user control.
• History shows deceptive humanoid designs can erode user trust.
• Experts argue humanoid robots aren’t necessary and may be misleading.
• Social robots may help in elder care or disaster relief—but simpler tools often work
better.
• Rescue robots succeed via tele-operation, not human-like behavior.
• Innovation should focus on practical, accessible solutions (e.g., integrated
dishwashers).
• Combining voice assistants with appliances and therapy/chatbots shows potential.
• Long-term success depends on usefulness, simplicity, and real user needs.
Examples of Robots:
Fig 16.3 SONY AIBO robot [Link]: photo by Ben Shneiderman
Fig 16.8 Google Nest Learning Thermostat. Source: [Link]
Fig 16.9 Roomba 700 series robotic vacuum cleaner sold by iRobot.
Fig 16.10 Roomba home screen and generated apartment map with room labels supplied by the author.
Fig 16.11 Replika chatbot with discussion session. Source: [Link]
Module 4
1. Explain with a neat figure the Human-Centered AI Principles. [CO1, RBT: L2]
Soln:
A 2020 Berkman Klein Center report documents the surge in policy activity.
It provides a summary of 36 leading and comprehensive reports.
The authors of the report highlight eight key HCAI themes for further
discussion:
• Privacy
• Accountability
• Safety and security
• Transparency and explainability
• Fairness and non-discrimination
• Human control of technology
• Professional responsibility
• Promotion of human values
Other reports stress ethics, such as IEEE’s “Ethically Aligned Design.”
This report is the result of a 3-year effort involving over 200 contributors.
It outlines eight general principles:
• Human rights
• Well-being
• Data agency
• Effectiveness
• Transparency
• Accountability
• Awareness of misuse
• Competence
The IEEE report strongly advocates that advanced systems should be created
and operated to respect, promote, and protect internationally recognized human
rights.
Figure 18.1 in the source material shows how closely aligned and similar the
principles are in the two reports.
Ethical principles serve as important foundations for clear thinking.
2. Explain with a neat figure the Governance Structures for Human-Centered AI. [CO1, RBT: L2]
Soln:
Alan Winfield (University of Bristol) and Marina Jirotka (Oxford University)
emphasize that “the gap between principles and practice is an important theme.”
A four-layer governance structure for HCAI systems may help bridge this
gap:
1. Reliable systems grounded in sound software engineering practices
2. Safety culture supported by proven business management strategies
3. Trustworthy certification provided by independent oversight
4. Regulation enforced by government agencies (as shown in Figure 18.2)
The inner oval represents software engineering teams that apply technical
practices tailored to their specific projects.
These teams operate within a larger organization (second oval) where safety
culture and management strategies influence project execution.
The third oval consists of independent oversight boards that monitor
multiple organizations in the same industry, gaining deeper insight and
promoting successful practices across the sector.
The largest oval represents government regulation, which adds a broader
perspective focused on the public’s interest in reliable, safe, and trustworthy
HCAI systems.
Government regulation is often controversial, but there are success stories
that show its value:
• The US National Transportation Safety Board’s investigations of transport
accidents have generally advanced public interests.
• The European General Data Protection Regulation (GDPR) led to
significant research and innovation in explainable AI.
• US regulations on automobile safety and fuel efficiency spurred
improvements in design research.
Reliability, safety, and trustworthiness are essential concepts for all
involved in technology development, whether AI-based or not.
Additional critical concerns at every level of governance include:
• Privacy
• Security
• Environmental protection
• Social justice
• Human rights
Corporations often make positive public statements about benefiting
customers and employees.
However, when faced with difficult choices involving power and money,
business leaders may prioritize corporate interests and shareholder expectations.
Current movements for human rights and corporate social responsibility
help build public support.
But these efforts are optional for most managers and lack enforcement.
Required processes involving:
• Software engineers
• Managers
• External reviewers
• Government agencies
—guided by clear principles and transparent corporate reporting—will
be more effective.
Internal and external review boards play a key role, especially in emerging
technologies like HCAI.
Public pressure may force corporate managers to recognize their societal
responsibilities and report publicly on progress.
Government policy-makers also need better understanding of HCAI
technologies and how corporate decisions impact the public interest.
While legislation by Congress or Parliament shapes industry practices,
government agency staff must make nuanced decisions about law enforcement.
Professional societies and NGOs are actively trying to educate and inform
government officials.
The proposed governance structures are grounded in existing practices
and offer practical steps for adaptation to new HCAI technologies.
They aim to define who takes action and who is responsible.
For successful implementation, these recommendations require:
• Budget allocation
• Scheduling
• Pilot testing
• Research to assess effectiveness
These governance structures are just a starting point.
New approaches will be necessary as technologies evolve or as market
forces and public opinion reshape successful products and services.
For example, in 2020, public opinion caused a dramatic shift in the use of
facial recognition technologies.
Major tech companies like IBM, Amazon, and Microsoft stopped selling
these systems to police departments due to concerns about potential misuse and
abuse.
3. Discuss the steps involved in Microsoft’s nine-stage software engineering workflow for machine learning projects. [CO1, RBT: L2]
Soln:
Microsoft offers a nine-stage software engineering workflow for HCAI (Human-
Centered AI) systems.
• This workflow appears closer to a linear waterfall model than to agile methods.
• However, descriptions suggest that more agile processes are used in its actual
execution.
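The nine stages themselves are not listed above; as a hedged reference, the sketch below enumerates the stage names usually attributed to the published Microsoft case study (Amershi et al., 2019), arranged as a simple ordered checklist. The names come from that study, not from this answer, and real teams loop back between stages rather than proceeding strictly forward:

# Stage names attributed to the Microsoft case study (Amershi et al., 2019);
# listed as a checklist, though practice is iterative rather than linear.
ML_WORKFLOW_STAGES = [
    "model requirements",
    "data collection",
    "data cleaning",
    "data labeling",
    "feature engineering",
    "model training",
    "model evaluation",
    "model deployment",
    "model monitoring",   # monitoring feeds back into earlier stages
]

def next_stage(current: str):
    """Return the following stage, or None once the pipeline is complete."""
    i = ML_WORKFLOW_STAGES.index(current)
    return ML_WORKFLOW_STAGES[i + 1] if i + 1 < len(ML_WORKFLOW_STAGES) else None

print(next_stage("model training"))   # -> "model evaluation"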
Microsoft emphasizes maintaining strong connections with customers throughout
their workflow.
Interviews with fourteen developers and managers revealed key priorities:
• Emphasis on data collection, cleaning, and labeling.
• Followed by model training, evaluation, and deployment.
The report distinguishes between software engineering and machine learning:
• “Software engineering is primarily about the code that forms shipping
software.”
• “ML (machine learning) is all about the data that powers learning models.”
HCAI system developers may use the waterfall model, but:
• Agile methods are more common.
• Agile methods help promote early engagement with clients and users.
HCAI projects differ from traditional programming projects:
• ML training data sets play a much stronger role in HCAI projects.
Traditional software testing methods (e.g., static code analysis) are insufficient
alone:
• Must be supplemented with dynamic testing using multiple data sets.
• Testing should also assess reliability across different usage contexts.
• User experience testing is necessary to ensure users can complete tasks
effectively.
User experience testing in HCAI systems should also address user perceptions:
• Important to help users understand how the ML system guides their actions.
• Users need enough understanding to decide whether to challenge outcomes.
Understandability is moderately important in systems like recommenders.
• However, it is crucial in systems involving consequential decisions (e.g.,
mortgages, parole).
Understandability is vital for acceptance and effective use in life-critical systems:
• Especially those used in medical, transportation, and military applications.
4. Discuss the methods involved in Verification and Validation Testing.
Soln:
Verification and Validation Testing in HCAI Systems
• Need for Novel Testing Approaches
o AI and ML in HCAI systems require new processes for algorithm
verification and validation.
o User experience testing with typical users is essential.
o Goal: Ensure HCAI systems meet user expectations and reduce harmful
outcomes.
• Civil Aviation as a Model
o Provides frameworks for:
▪ Certification of new designs.
▪ Verification and validation during use.
▪ Certification testing for users (e.g., pilots).
• US National Security Commission on AI
o Stresses the importance of iterative testing, evaluation, verification, and
validation incorporating user feedback.
• Testing for Different ML Types (by Jie M. Zhang et al.)
o Supervised Learning: Uses labeled training data.
o Unsupervised Learning: Uses unlabeled data to understand data
patterns.
o Reinforcement Learning: Based on sequences of actions,
observations, and rewards.
• Training Data Importance
o ML performance heavily depends on training data.
o Data must reflect diverse and relevant contexts to increase accuracy and
reduce biases.
o Example: Hospital-specific data required for cancer detection due to
demographic variations.
• Validation Case Example
o Pneumonia detection AI system showed varied results across hospitals.
o Influencing factors: X-ray machine differences, patient characteristics, and equipment positioning (a synthetic illustration of this effect appears after this list).
• Data Documentation Challenges
o Need for continuous and updated documentation of multiple datasets.
o Traditional repositories (e.g., GitHub) insufficient for tracking changes
in datasets.
o Promising tools:
▪ Blockchain for provenance tracking.
▪ Methods for ensuring data representativeness over time.
▪ Clear designation of data curation responsibilities to include
human oversight.
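A small, synthetic Python illustration of the validation point noted above: a model trained on data from one "hospital" can lose accuracy at another site whose measurements are systematically shifted, for example by different X-ray equipment. The data and numbers below are entirely synthetic:

import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score

rng = np.random.default_rng(0)

def make_patients(n: int):
    """Synthetic underlying signal and binary label (1 = disease present)."""
    signal = rng.normal(size=(n, 2))
    labels = (signal.sum(axis=1) > 0).astype(int)
    return signal, labels

signal_a, y_a = make_patients(500)
X_a = signal_a                # training hospital: measurements match the signal
signal_b, y_b = make_patients(500)
X_b = signal_b + 1.5          # deployment hospital: equipment adds a systematic offset

model = LogisticRegression().fit(X_a, y_a)
print("accuracy at training site:  ", accuracy_score(y_a, model.predict(X_a)))
print("accuracy at deployment site:", accuracy_score(y_b, model.predict(X_b)))
# The drop at the second site shows why training data must reflect each usage context.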
Popular Testing Techniques for HCAI Systems
• Traditional Case-Based Testing
o Involves matching input values with expected outputs.
o Example use: mortgage application approvals, animal image
classification, sentence translation.
o Benefits:
▪ Helps identify extreme cases and system failures.
▪ Encourages inclusion of both developers and non-developers in
test case design.
o Established datasets available:
▪ ImageNet (14 million labeled images).
▪ NIST TREC (text retrieval evaluation data sets).
• Adversarial Testing
o A critical part of verification.
o Detects vulnerabilities that could be exploited by malicious actors.
o Requires continuous update of test cases to match evolving
requirements and usage contexts.
• Differential Testing
o Compares outputs from current and previous system versions.
o Automatically identifies discrepancies.
o Pros:
▪ No need for predefined expected results.
o Cons:
▪ Past errors may persist.
▪ New features can't be directly compared.
o Also used to compare different systems or training data sets for ML
models.
• Metamorphic Testing
o Based on logical relationships among outputs.
o Examples:
▪ Pathfinding algorithms: path from A to B = path from B to A.
▪ Mortgage approval should not fail for lower amounts if
approved for higher.
▪ Price filters in recommender systems should return subsets
when price is lowered.
o No expected results needed; relations are inferred from logic and system behavior (a minimal code sketch appears after this list).
• User Experience Testing
o Essential for user-facing HCAI applications (e.g., mortgages, parole,
job interviews).
o Involves giving tasks to users who think aloud while completing them.
o Sessions typically last 30–120 minutes.
o Focus: Capture comments, actions, and identify usability issues.
o Practical for development; distinct from controlled experiments aimed
at statistical validation.
• Red Team Testing
o External teams simulate attacks on HCAI systems.
o Originated in military and cybersecurity; now applied to AI systems.
o Examples of red team strategies:
▪ Misleading facial recognition with altered appearances.
▪ Disrupting self-driving systems with manipulated signage or
patterns.
o MITRE Corporation’s ATT&CK matrix:
▪ Catalogs ~300 attack tactics.
▪ Guides developers on possible vulnerabilities.
o Suggestion: Develop a similar matrix specifically for HCAI system
vulnerabilities.
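A minimal Python sketch of the metamorphic testing idea referenced above: rather than fixed expected outputs, the test checks a logical relation between outputs. The approve_mortgage function is a toy stand-in for the system under test, not a real model:

def approve_mortgage(income: float, loan_amount: float) -> bool:
    """Toy stand-in for the system under test (e.g., a trained ML model)."""
    return loan_amount <= income * 4

def check_monotone_in_loan_amount(income: float, amount: float, smaller: float) -> None:
    """Metamorphic relation: if a larger loan is approved, a smaller one must be too."""
    assert smaller <= amount
    if approve_mortgage(income, amount):
        assert approve_mortgage(income, smaller), (
            f"approved {amount} but rejected smaller {smaller} at income {income}"
        )

# Run the relation over a grid of cases; no expected-output labels are needed.
for income in (40_000, 80_000, 120_000):
    for amount in (100_000, 250_000, 400_000):
        check_monotone_in_loan_amount(income, amount, smaller=amount * 0.5)
print("metamorphic relation held on all sampled cases")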
Testing Documentation and Professionalization
• Recording Test History
o Important for traceability and accountability.
o Helps reconstruct problems and identify who made which repairs.
• Datasheets for Datasets (Microsoft)
o Standardizes documentation of:
▪ Data collection.
▪ Cleaning methods.
▪ Usage and contacts.
• Model Cards (Google)
o Templates for documenting model details and behaviors.
• Provenance and Testing Histories
o Learnings from databases and visualization research aid in tracking and
documentation.
o Shift towards mature and professional software engineering practices.
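As a rough illustration of the documentation practices above, such records can be kept as structured data next to the model code. The fields below are illustrative only and are not the official Google Model Card or Microsoft Datasheet templates:

import json

model_card = {
    "model_name": "pneumonia-screen-v3",   # hypothetical model
    "intended_use": "decision support for radiologists, not autonomous diagnosis",
    "training_data": {
        "source": "Hospital A chest X-rays, 2018-2021",          # hypothetical dataset
        "collection_notes": "images de-identified; labels from two radiologists",
        "known_gaps": "few pediatric cases; single X-ray machine vendor",
    },
    "evaluation": {
        "held_out_accuracy": 0.91,          # illustrative number
        "per_site_results": "see testing history log",
    },
    "responsible_contact": "data curation lead for updates and provenance",
}

print(json.dumps(model_card, indent=2))     # exported alongside each release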
Special Considerations for Robotic and Sensitive Applications
• High-Risk Scenarios
o Mobile robots, weapons, and medical devices need rigorous safety
testing.
o Metrics include:
▪ Safe operation.
▪ Task completion time and success.
▪ Quality and quantity of output.
• Established Sectors as Models
o Aviation, healthcare, and automotive industries provide benchmarks
and certification processes.
• Bias Testing
o Beyond technical accuracy, HCAI systems must be fair.
o Required especially in applications impacting people's lives.
5. Discuss the methods involved in Bias Testing to Enhance Fairness. [CO1, RBT: L2]
Soln:
Bias Testing to Enhance Fairness:
• Concerns in AI applications:
o As AI is used in consequential decisions (e.g., parole, loans, hiring),
critics like Cathy O’Neil highlight risks.
o Her book Weapons of Math Destruction identifies 3 dangers of
algorithms:
▪ Opacity: Algorithms are hidden, making them hard to
challenge.
▪ Scale: Widespread use by major institutions amplifies impact.
▪ Harm: They can result in unfair treatment with life
consequences.
• Community and commercial response:
o Conferences like Fairness, Accountability, and Transparency in ML
arose to tackle biases (gender, racial, age, etc.).
o Biases exposed in real-world AI deployments (e.g., biased parole
decisions, hate chatbots, hiring tools).
o Commercial practices began to shift in response.
• Types of bias (Friedman & Nissenbaum):
o Pre-existing bias: Rooted in social attitudes (e.g., housing
discrimination).
o Technical bias: Arises from software/hardware design constraints (e.g.,
organ donor list UI).
o Emergent bias: From changing context of use (e.g., education software
across cultures).
• Additional biases (Baeza-Yates):
o Identified geography, language, and culture as embedded biases.
o Warned that "bias begets bias" (popular sites gain more visibility).
• Ethics frameworks and expanded understanding:
o IEEE’s Ethically Aligned Design aims for an ethical AI foundation.
o USC review expands bias types (statistical, user interaction, funding,
etc.) and offers mitigation strategies.
• Bias and disability:
o Meredith Ringel Morris notes AI's mixed effects on disabled users.
o Suggests inclusive training data and guarding against exploitation of
cognitive disabilities.
• Healthcare disparities:
o Algorithmic bias results in unequal care (e.g., black patients receive
less support).
o Addressing bias could significantly improve outcomes for marginalized
groups.
• Gender bias in computing:
o Underrepresentation of women affects design and fairness.
o Bias in hiring, education, and services continues to be a concern.
• Bias testing strategies:
o Start with testing for data representativeness and known historical
biases.
o Use mitigation techniques (e.g., removing demographic attributes that
influence decisions).
o Commercial toolkits like IBM’s Fairness 360 assist with bias detection (a minimal disparate-impact check is sketched at the end of this answer).
• Assigning responsibility:
o Appoint a bias testing leader to oversee data/program assessments and
respond to issues.
o Maintain a library of test cases for bias validation.
o Encourage external oversight to combat team bias-blindness.
• Persistent issues and public exposure:
o Simple bias tests help, but issues like intersectional bias (e.g., black
women) remain.
o Publishing and media exposure pressure companies to improve
systems.
• Activism and impact:
o Joy Buolamwini and Timnit Gebru exposed facial recognition bias.
o Their work influenced corporate changes and was featured in the
documentary Coded Bias.
• Controversy and systemic bias:
o Gebru's firing from Google sparked industry-wide dialogue on bias and
diversity.
o Algorithmic bias is symptomatic of broader systemic inequities.
• Real-world bias examples:
o Google Image searches show racial bias in terms like “professional
hair” vs. “unprofessional hair.”
o These biases reflect and perpetuate harmful stereotypes.
Fig 19.2 (a) Google Search for “Professional hair” shows mostly light-skinned women.
(b) Google Search for “Unprofessional hair” shows mostly dark-skinned women.
• Indigenous perspectives and cultural relevance:
o Indigenous communities emphasize local context, culture, and kinship
in tech use.
o They advocate for culturally grounded approaches to AI.
o D. Fox Harrell supports culturally aware innovation to reduce bias.
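To illustrate the bias testing strategies discussed earlier in this answer, here is a minimal sketch of two widely used group-fairness checks, statistical parity difference and disparate impact, written in plain Python. Toolkits such as IBM's AI Fairness 360 provide these and many more metrics; the tiny decision set below is made up for illustration.

# Each entry: (applicant belongs to protected group, received positive outcome).
decisions = [
    (True, 1), (True, 0), (True, 0), (True, 1),
    (False, 1), (False, 1), (False, 0), (False, 1),
]

def positive_rate(in_protected_group):
    outcomes = [y for g, y in decisions if g == in_protected_group]
    return sum(outcomes) / len(outcomes)

p_protected = positive_rate(True)    # selection rate for the protected group
p_other = positive_rate(False)       # selection rate for everyone else

statistical_parity_diff = p_protected - p_other   # 0 means parity
disparate_impact = p_protected / p_other          # the "80% rule" flags values below 0.8

print(f"parity difference: {statistical_parity_diff:+.2f}, "
      f"disparate impact: {disparate_impact:.2f}")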
6. Explain with examples the Internal Review Boards for Problems and Future Plans.
Soln:
Internal Review Boards for Problems and Future Plans:
Fostering a Safety Culture
• Monthly meetings focus on:
o Discussing failures and near-misses.
o Celebrating resilient responses to challenges.
• Statistical reporting standardizes data to:
o Help understand important metrics.
o Encourage suggestions for new metrics.
• Summaries (internal or public) emphasize:
o The importance of a safety-focused organizational culture.
Role of Review Boards
• Composed of managers, staff, and other stakeholders.
• Encourage diverse perspectives for continuous improvement.
• Public performance reporting examples:
o Aviation: On-time performance and lost baggage rates.
o Hospitals: Patient care results by condition or procedure.
• AHRQ (Agency for Healthcare Research and Quality):
o Conducts safety culture surveys across hospitals, nursing homes,
pharmacies.
o Raises awareness and tracks trends regionally and over time.
Disclosure, Apology, and Offer Programs (Healthcare)
• Shift from fear of lawsuits to:
o Full disclosure of errors.
o Clear apology to patients/families.
o Offer of remedial treatment or financial compensation.
• Results of implementation:
o Reduced malpractice lawsuits (often by half).
o Fewer medical errors due to increased awareness.
o Improved professional and organizational pride.
HCAI Auditing and Internal Review Example: Google’s 5-Stage Framework
1. Scoping: Define project/audit scope and identify risks.
2. Mapping: Stakeholder mapping, interviews, and metric selection.
3. Artifact Collection: Document design, data, and ML models.
4. Testing: Adversarial testing to find edge cases and potential failures.
5. Reflection: Analyze risks, plan for remediation, document design history.
• Post-audit practices:
o Self-assessment summary report.
o Mechanisms to track implementation of audit outcomes.
• Emphasized point: Internal audits are only one part of a broader
accountability system.
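Stage 4 (Testing) of the framework above calls for adversarial testing to find edge cases. A minimal sketch of that idea is shown below: a set of deliberately awkward inputs is run through the system and each output is checked against a sanity constraint. The scoring function and test cases are hypothetical stand-ins, not part of Google's framework.

def score_applicant(record):
    # Hypothetical stand-in for the model under audit: maps income to a 0-1 score.
    income = record.get("income", 0)
    return min(max(income / 100_000, 0.0), 1.0)

# Deliberately awkward inputs chosen to probe edge cases.
edge_cases = [
    {"name": "missing income",  "record": {}},
    {"name": "zero income",     "record": {"income": 0}},
    {"name": "negative income", "record": {"income": -5_000}},
    {"name": "extreme income",  "record": {"income": 10**12}},
]

for case in edge_cases:
    score = score_applicant(case["record"])
    ok = 0.0 <= score <= 1.0    # output must stay in a sane range
    print(f"{case['name']:<16} score={score:.2f} {'OK' if ok else 'FAIL'}")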
Corporate Review and Ethics Boards
• Facebook Oversight Board (launched mid-2020):
o Oversees content monitoring and platform governance.
• Microsoft’s AETHER Committee:
o Advises on responsible AI, best practices, and trusted technology.
• Microsoft’s Office of Responsible AI:
o Develops company-wide governance and readiness rules.
o Handles sensitive use cases.
o Participates in shaping HCAI laws, norms, and standards for societal
benefit.
7. Explain with a neat figure the software quality improvement method using the characteristics of the maturity levels. (CO2, L2)
Soln:
Capability Maturity Models and Documentation Practices to improve HCAI
(Human-Centered AI) systems:
Capability Maturity Model Integration (CMMI)
• Developed by Software Engineering Institute (SEI) in the late 1980s;
regularly updated.
• Focus: Improve software development processes (not product standards).
• The 2018 CMMI version aims to:
o Integrate organizational functions.
o Set process improvement goals and priorities.
o Provide guidance for quality and reference points for appraisals.
• Five maturity levels:
o Level 1: Unpredictable, poorly controlled, reactive processes.
o Levels 2-5: Increasing process standardization, control, metrics, and proactive optimization.
• Used in U.S. government contracts, especially for defense, requiring a
specific maturity level.
• Emphasizes training, metrics, and organization-wide performance
optimization.
Challenges and Emerging Models
• Critics argue CMMI can lead to bureaucratic overhead, potentially slowing
agile/lean methods.
• Still, there is rising interest in HCAI-specific maturity models for:
o Medical devices
o Transportation
o Cybersecurity
Machine Learning Maturity and Trustworthiness Models
• UK Institute for Ethical AI and Machine Learning: Proposes a model with
hundreds of benchmarks (e.g., data/model assessment, explainability).
• Proposal: Transform HCAI Maturity Models into Trustworthiness
Maturity Models (TMMs):
o Level 1: Initial, team-driven, unpredictable use.
o Level 2: Uniform training and tools for consistency.
o Level 3: Regular review/refinement of tools/processes.
o Level 4: Measurable performance, audit trail analysis.
o Level 5: Continuous improvement across time and teams.
• TMM assessments cover:
o Biased data testing
o System validation/verification
o User experience and performance testing
o Customer complaint reviews
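A minimal sketch of how a team might record a Trustworthiness Maturity Model self-assessment is given below. The level descriptions follow the list above; the practice areas, scores, and the "lowest area" roll-up are illustrative assumptions rather than a prescribed scheme.

# Level descriptions follow the proposed TMM levels above.
TMM_LEVELS = {
    1: "Initial: team-driven, unpredictable use",
    2: "Uniform training and tools for consistency",
    3: "Regular review and refinement of tools/processes",
    4: "Measurable performance, audit-trail analysis",
    5: "Continuous improvement across time and teams",
}

# Hypothetical self-assessment scores per practice area.
assessment = {
    "biased data testing": 3,
    "validation/verification": 2,
    "user experience testing": 4,
    "customer complaint reviews": 2,
}

for area, level in assessment.items():
    print(f"{area:<28} level {level}: {TMM_LEVELS[level]}")
print("overall (lowest area):", min(assessment.values()))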
Call for Responsible AI Development
• Workshop report with 59 co-authors:
o Argues for moving from principles to mechanisms of trustworthiness.
o Recommends structures/workflows for software and hardware.
o Supports “verifiable, falsifiable claims” to demonstrate responsible
AI.
Documentation Templates for HCAI Systems
• Datasheets for Datasets (2018):
o Standard format for documenting dataset creation, use cases, and
ethical concerns.
o Sparked tools like:
▪ Google’s Model Cards
▪ Microsoft’s Datasheets
▪ IBM’s FactSheets
• IBM’s FactSheets:
o Evaluated by 35 participants across six systems (e.g., breast cancer
detector).
o Assessed on completeness, clarity, conciseness.
o Refinement based on a second user study.
• Explainability Fact Sheets (Sokol & Flach, University of Bristol):
o Framework for documenting features of a system’s explanations.
• Goal: Mature documentation practices to improve reliability, safety, and
trust in AI systems.
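As a rough companion to the dataset example earlier, the sketch below shows the kind of fields a model-card or FactSheet style record might hold for the breast cancer detector mentioned above. The field names paraphrase the questions these templates ask and are not the official Google Model Cards or IBM FactSheets schema; the evaluation numbers are made up.

# Hypothetical model-card / FactSheet style record; field names and values are illustrative.
model_card = {
    "model": "breast-cancer-detector-v2 (hypothetical)",
    "intended_use": "decision support for radiologists, not autonomous diagnosis",
    "training_data": "see the accompanying dataset datasheet",
    "evaluation": {"sensitivity": 0.94, "specificity": 0.89},   # made-up numbers
    "fairness_checks": "performance reported per age group and scanner type",
    "limitations": ["not validated on pediatric cases"],
    "contact": "ml-governance@example.org",
}

for key, value in model_card.items():
    print(f"{key:>16}: {value}")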
8. Explain in detail the three independent oversight methods used in Trustworthy Certification. (CO2, L2)
Soln:
Independent Oversight Methods for Human-Centered AI (HCAI) systems:
Three Common Independent Oversight Methods (Figure 21.1)
1. Planning Oversight
• Purpose: Review proposals for new HCAI systems or major upgrades before
implementation.
• Goal: Gather feedback, influence design early, and ensure plans follow ethical
and legal standards.
• Comparison: Similar to zoning boards for new building approvals.
• Variation: Algorithmic Impact Assessments (AIAs)
o Modeled after Environmental Impact Statements.
o Allow stakeholder input prior to system deployment.
• Requirement: Needs follow-up reviews to verify that the approved plan was
implemented properly.
2. Continuous Monitoring
• Definition: Ongoing oversight through in-place inspectors or regular audits.
• Examples:
o FDA: Monitors pharmaceutical and meat-packing plants.
o Federal Reserve Board: Oversees large banks continuously.
o Elevators: Quarterly inspections.
o Public companies: Annual financial audits.
• In HCAI:
o Useful for dynamic systems like mortgage approval or parole
decisions.
o Critical during contextual shifts, e.g., COVID-19, where applicant
profiles or system conditions change rapidly.
• Challenge: High cost and resource intensity.
3. Retrospective Analysis
• After-the-fact reviews of failures or disasters.
• Example:
o National Transportation Safety Board (NTSB): Thorough
investigations of crashes (planes, trains, ships).
• Emerging Practices:
o FCC: Reviewing HCAI in social media, especially for accessibility
and misinformation.
o Other agencies working on:
▪ Principles and policies for HCAI failure analysis.
▪ Voluntary audit trail guidelines for post-event review and
accountability.
General Considerations
• Skepticism exists around oversight due to:
o Failures of independence
o Inadequate enforcement power
• Nonetheless, these methods are widely valued for ensuring safety,
accountability, and public trust in HCAI.
Subject In-charge HoD