What is a Materialized View?

A materialized view is a database object that stores the results of a query physically on disk, rather than computing them on the fly every time you need them. It’s basically a snapshot of your query results that you can refresh periodically. Unlike regular views (which are just saved queries that execute each time you use them), materialized views pre-compute and cache the data, making subsequent reads much faster.

If you have a complex query that joins multiple tables, performs aggregations, or does heavy calculations, running it every time someone needs the data can be slow and resource-intensive. A materialized view runs that query once, saves the results, and lets you query those saved results directly.

Why Use Materialized Views?

The main reason you might want to use a materialized view is performance. When you’re dealing with queries that take seconds or even minutes to run, converting them to materialized views can reduce response times to milliseconds. This can be especially valuable for:

  • Reporting and analytics: If you have dashboards that aggregate sales data across millions of rows, materializing those aggregations means your dashboards load instantly instead of making users wait while calculations run.
  • Complex joins: Queries that join five or six tables with various conditions can be expensive. Materializing the results eliminates the join overhead on every query.
  • Expensive calculations: Any query doing heavy mathematical operations, window functions, or recursive queries can benefit from pre-computation.

The trade-off is freshness. Since you’re working with stored results, the data in a materialized view is only as current as the last time you refreshed it. If your underlying tables change, those changes won’t appear in the materialized view until you refresh it. This makes materialized views more suitable for data that doesn’t need to be up-to-the-second accurate, like daily sales reports or monthly analytics.

How Materialized Views Work

When you create a materialized view, the database executes your query and stores the result set as a physical table. From that point on, when you query the materialized view, you’re reading from that stored result set rather than executing the original complex query.

Let’s look at a simple example using PostgreSQL.

We’ll create and populate two tables:

-- Create sample tables
CREATE TABLE books (
    book_id SERIAL PRIMARY KEY,
    title TEXT,
    author TEXT,
    price DECIMAL(10,2)
);

CREATE TABLE orders (
    order_id SERIAL PRIMARY KEY,
    book_id INTEGER REFERENCES books(book_id),
    quantity INTEGER,
    order_date DATE
);

-- Insert sample data
INSERT INTO books (title, author, price) VALUES
    ('The Midnight Garden', 'Sarah Chen', 24.99),
    ('Quantum Dreams', 'Marcus Webb', 29.99),
    ('Desert Echoes', 'Sarah Chen', 19.99),
    ('The Algorithm', 'Marcus Webb', 34.99),
    ('River Songs', 'Aisha Patel', 22.99);

INSERT INTO orders (book_id, quantity, order_date) VALUES
    (1, 3, '2024-10-15'),
    (2, 1, '2024-10-16'),
    (1, 2, '2024-10-18'),
    (3, 4, '2024-10-20'),
    (4, 1, '2024-10-22'),
    (5, 2, '2024-10-23');

This database is for an online bookstore.

Let’s say you frequently need to see total sales by author. In this case you could create the following materialized view:

CREATE MATERIALIZED VIEW author_sales AS
SELECT 
    b.author,
    SUM(o.quantity * b.price) as total_revenue,
    SUM(o.quantity) as books_sold
FROM books b
JOIN orders o ON b.book_id = o.book_id
GROUP BY b.author;

Now you can query the materialized view just like you’d query any other view or table:

SELECT * FROM author_sales ORDER BY total_revenue DESC;

Result:

   author    | total_revenue | books_sold 
-------------+---------------+------------
Sarah Chen | 204.91 | 9
Marcus Webb | 64.98 | 2
Aisha Patel | 45.98 | 2

Now when you run SELECT * FROM author_sales, you’re getting the pre-computed results instantly. No joins, no aggregations. Just a simple lookup.

This simple example is obviously purely for demonstration purposes, so we might not see any performance benefits from using a materialized view in this case. But imagine if the tables contained lots of rows, and you needed to join them with several other tables. In that case, you might well see a significant performance benefit from using the materialized view.

Refreshing Materialized Views

The stored data in a materialized view becomes stale as the underlying tables change. To update it, you need to refresh the view. Different database systems handle this differently.

Suppose we insert another order into the underlying orders table:

INSERT INTO orders (book_id, quantity, order_date) VALUES
    (2, 3, '2024-10-25');

And we query the materialized view immediately:

SELECT * FROM author_sales ORDER BY total_revenue DESC;

Result:

   author    | total_revenue | books_sold 
-------------+---------------+------------
Sarah Chen | 204.91 | 9
Marcus Webb | 64.98 | 2
Aisha Patel | 45.98 | 2

The results are the same as before. Nothing has changed.

That’s because we haven’t refreshed the materialized view. Let’s go ahead and refresh it.

In PostgreSQL, you manually refresh materialized views using REFRESH MATERIALIZED VIEW:

REFRESH MATERIALIZED VIEW author_sales;

Now let’s run the query again:

SELECT * FROM author_sales ORDER BY total_revenue DESC;

Result:

   author    | total_revenue | books_sold 
-------------+---------------+------------
Sarah Chen | 204.91 | 9
Marcus Webb | 154.95 | 5
Aisha Patel | 45.98 | 2

This time it provides up to date information. In particular, we can see that Marcus Webb has now sold 5 books (instead of the 2 books that the previous query returned).

Oracle offers more flexibility with automatic refresh options. You can set up materialized views to refresh on commit (immediately when underlying data changes), on demand (manually), or on a schedule. Here’s what that might look like in Oracle:

CREATE MATERIALIZED VIEW author_sales
REFRESH COMPLETE ON DEMAND
AS
SELECT 
    b.author,
    SUM(o.quantity * b.price) as total_revenue,
    SUM(o.quantity) as books_sold
FROM books b
JOIN orders o ON b.book_id = o.book_id
GROUP BY b.author;

In this case, the ON DEMAND part dictates that the materialized view must be manually refreshed by a user (e.g., executing the DBMS_MVIEW.REFRESH procedure). It does not auto-refresh.

To enable automatic refresh for a materialized view in Oracle, you must use the ON COMMIT or START WITH/NEXT clauses:

CREATE MATERIALIZED VIEW author_sales
REFRESH FAST ON COMMIT  -- Refreshes automatically when base tables commit
AS ...

Or:

CREATE MATERIALIZED VIEW author_sales
REFRESH COMPLETE  -- Can be COMPLETE or FAST
START WITH SYSDATE   -- Time of the first refresh (e.g., immediately)
NEXT SYSDATE + 1/24  -- Interval for subsequent refreshes (e.g., every hour)
AS ...

SQL Server doesn’t use the term “materialized view” but provides similar functionality through indexed views. These are regular views with a unique clustered index that causes the results to be stored physically.

The refresh strategy you choose depends on your needs. If data changes frequently and you need relatively fresh results, you might refresh every few minutes or hours. For data that only changes daily (like end-of-day reports), refreshing once at night might be sufficient.

Complete vs. Fast Refresh

Some database systems offer different refresh methods. A complete refresh recalculates the entire view from scratch, which can be time-consuming for large datasets. A fast refresh (also called incremental refresh) only updates the changes since the last refresh, which is much quicker but requires the database to track changes in the underlying tables.

Oracle supports both approaches. For fast refresh to work, you typically need to create materialized view logs on the base tables that track which rows have changed:

-- 1. Create MV Log on BOOKS (specifying ROWID and necessary columns)
CREATE MATERIALIZED VIEW LOG ON books
WITH ROWID, SEQUENCE (author, book_id, price)
INCLUDING NEW VALUES;

-- 2. Create MV Log on ORDERS (specifying ROWID and necessary columns)
CREATE MATERIALIZED VIEW LOG ON orders
WITH ROWID, SEQUENCE (book_id, quantity)
INCLUDING NEW VALUES;

-- 3. Create Materialized View
CREATE MATERIALIZED VIEW author_sales
REFRESH FAST ON DEMAND
AS
SELECT 
    b.author,
    SUM(o.quantity * b.price) as total_revenue,
    SUM(o.quantity) as books_sold,
    COUNT(*) as cnt,
    COUNT(o.quantity) as cnt_quantity,
    COUNT(b.price) as cnt_price
FROM books b
JOIN orders o ON b.book_id = o.book_id
GROUP BY b.author;

Note that fast refresh has requirements in that your query needs to include certain aggregate functions and the COUNT(*) of rows for the database to calculate incremental changes correctly.

When to Use (and Not Use) Materialized Views

Materialized views can be ideal in certain scenarios. For example:

  • Data warehousing and business intelligence applications use them extensively because analytical queries are often complex and read-heavy, but the data doesn’t need to be real-time.
  • If you’re building summary tables for reporting, materialized views can handle the aggregation logic for you.
  • They’re also great for improving performance of frequently-accessed complex queries. If your application runs the same expensive query dozens of times per minute, materializing it makes sense. And when you’re working with remote data, materialized views can cache that data locally to avoid network latency.

However, materialized views aren’t a universal solution. If you need real-time data, the staleness issue becomes problematic. Frequently refreshing a large materialized view can be just as expensive as running the original query repeatedly, defeating the purpose. They also consume additional storage space since you’re essentially duplicating data, and maintaining them adds complexity to your database schema.

Don’t use materialized views as a band-aid for poorly written queries. If you can optimize your query or add appropriate indexes to make it fast enough, that’s usually a better solution than introducing materialized views with their refresh overhead and staleness concerns.

Materialized Views vs. Tables

You might be wondering, if a materialized view stores data like a table, why not just use a table? You certainly can populate a regular table with query results and query that instead. But the advantage of using materialized views is that the database manages them for you. The query logic stays with the view definition, so when you refresh it, the database automatically runs the correct query. With a regular table, you’d need to write and maintain separate ETL logic to populate it.

Materialized views also integrate better with query optimizers in some databases. The optimizer knows the relationship between the materialized view and its base tables, which can lead to better execution plans. Some systems can even automatically use a materialized view to answer a query even when you don’t explicitly query the view. This is a feature called query rewrite.

That said, if you need more control over the refresh process, custom logic for handling updates, or the ability to have triggers and constraints, a regular table with your own refresh logic might be more appropriate.

Database-Specific Considerations

Each database system implements materialized views a bit differently. For example:

  • PostgreSQL keeps things simple with manual refresh only, where you’re in full control of when the view updates. This simplicity is easy to understand but means you need to set up your own scheduling if you want automatic refreshes.
  • Oracle has a more sophisticated implementation with automatic refresh options, fast refresh capabilities, and query rewrite. It’s powerful but comes with a learning curve and specific requirements for your queries to work with fast refresh.
  • SQL Server’s indexed views are automatically maintained. When the underlying data changes, the indexed view updates immediately. This keeps data fresh but can impact write performance since every insert, update, or delete on the base tables must also update the indexed view. There are also restrictions on what kinds of queries can be used in indexed views.

Some Tips When Considering Materialized Views

When working with materialized views, start by identifying your slowest, most frequently-run queries. Those are your best candidates. Create the materialized view and test the performance improvement. Make sure the refresh time is reasonable. If refreshing takes longer than running the original query, you might need to reconsider your approach.

Set up a refresh schedule that balances data freshness with system load. Refreshing during low-traffic periods is usually best. Monitor the size of your materialized views. If they’re consuming too much storage, you might need to be more selective about what you materialize.

Remember that indexes can help with both the base tables and the materialized view itself. Creating appropriate indexes on a materialized view can make queries against it even faster.

Finally, document your materialized views well. Future developers (including yourself) need to understand what data the view contains, how often it refreshes, and what the staleness implications are for the application. Nothing’s worse than debugging an issue only to discover the data is stale because a materialized view hasn’t refreshed in days.

Summary

Materialized views can be a useful tool for optimizing read-heavy workloads where you can tolerate some data staleness. Used appropriately, they can dramatically improve query performance with relatively little effort. Just make sure you understand the trade-offs and choose your refresh strategy carefully.