Install the Netflix Sample Database in PostgreSQL

The Netflix sample database is a learning database based on public data from Netflix’s Engagement Report and Global Top 10 lists. This guide walks you through installing it on PostgreSQL.

For more background on this database, see my introduction to the Netflix Sample Database.

1. Download the Script

Head to the Netflix Sample Database GitHub repository and download the PostgreSQL installation package. The actual script is on the releases page at:

https://github.com/lerocha/netflixdb/releases

You’ll get a .zip file. Extract the .zip file to get the SQL script for PostgreSQL (which will be called something like netflixdb-postgres.sql). This .sql file is what we’ll use to set up the sample database.

2. Create Your Database

The script doesn’t create the database (it simply creates the tables and populates them with data), so you’ll need to do that first. Connect to your PostgreSQL server and create a new database:

CREATE DATABASE netflixdb;

Then connect to the new database:

\c netflixdb

If you’re using a GUI tool, you can create the database through the interface and then connect to it.

3. Run the Installation Script

Now that you’ve connected to the newly created database, you can go ahead and run the SQL script to create the tables and load the data.

If you’re still in psql:

\i path/to/netflixdb-postgres.sql

If you’re using psql from the command line:

psql -d netflixdb -f path/to/netflixdb-postgres.sql

If you get an error telling you that the file doesn’t exist, check that you’re using the correct file name. As mentioned, the one I downloaded is called netflixdb-postgres.sql (not to be confused with netflixdb-postgresql.sql). So, the file uses postgres instead of postgresql (the ql part is mssing).

Using pgAdmin or another GUI:

Open the query tool, load the SQL script file, and execute it. Here’s a sneak peak at the top of the script:

Screenshot of the script for creating the Netflix sample DB in PostgreSQL

The script creates all tables, sets up relationships, and loads the data. This typically takes a few minutes.

4. Verify the Installation

Check that the tables were created:

SELECT table_name 
FROM information_schema.tables 
WHERE table_schema = 'public'
ORDER BY table_name;

You should see:

  table_name  
--------------
movie
season
tv_show
view_summary

You can check the row counts:

SELECT 
    'movie' as table_name, 
    COUNT(*) as row_count 
FROM movie
UNION ALL
SELECT 'season', COUNT(*) FROM season
UNION ALL
SELECT 'tv_show', COUNT(*) FROM tv_show
UNION ALL
SELECT 'view_summary', COUNT(*) FROM view_summary;

If you see data in these tables, the installation was successful. Here’s the result that I got:

  table_name  | row_count 
--------------+-----------
movie | 11922
season | 8555
tv_show | 4705
view_summary | 37225

Explore the Schema

Here’s an ER diagram of the database that I generated in pgAdmin:

ER diagram of the Netflix sample database in PostgreSQL

You can use the following query to view the structure of the tables:

-- View columns for a specific table
SELECT 
    column_name,
    data_type,
    character_maximum_length,
    is_nullable
FROM 
    information_schema.columns
WHERE 
    table_name = 'movie'
ORDER BY 
    ordinal_position;

Run this for each table to see what columns are available.

Sample Queries

Here are a few queries to verify the database is working:

Check for specific content:

SELECT title 
FROM movie 
WHERE title ILIKE '%stranger%'
LIMIT 5;

Join movies with viewing data:

SELECT m.title, vs.* 
FROM movie m
JOIN view_summary vs ON m.id = vs.movie_id
LIMIT 5;

As mentioned the Netflix sample database repository is updated regularly with new data from Netflix’s public reports. You can download fresh versions of the database as needed to work with the most current information.