Category Archives: Development and testing

Is There a “Best Tool for a Job?”

We often hear the phrase: Choose the right tool for the job. On the surface, there is nothing to argue about. We have tools: hammers and saws, mixers and slow cookers. Each tool is designed to accomplish a specific task. Hammers for nails. Screwdrivers for screws. Microscopes, for neither of them.

That’s true; however, I remember my grandaunt refusing to use a mixer for meringue. “I can do better with a fork!” she used to say – and she could, indeed! Like nobody else in the family! Or watch me when I come to the youth shelter to make dinner with the residents and refuse to use a potato peeler. “I can do better with a knife!” I tell them, and they roll their eyes and do not try to compete.

Back from the kitchen to the data center. There are operating systems, and there are database engines. There are programming languages and QA frameworks. And we do not use one in place of another.
What about “the best programming language for the job”? Do we have a way to decide objectively? What about databases? Can you tell: “This DBMS will be the best choice for you to accomplish your task?”
If your answer is “yes,” I will challenge it.

With a few exceptions and a handful of corner cases, most DBMSs can support virtually any set of requirements. I can do anything possible and impossible using PostgreSQL, and my coworkers who are more proficient with Oracle would accomplish the same task using Oracle – faster than they could master the equivalent PostgreSQL features.

I think “the best tool for a job” is the one you know best. The reason is that the developer’s time is always the most expensive resource, and if you can achieve a similar quality of the resulting product, the best solution is the one that takes the least time. That said, I do not think any company should support an extensive collection of different DBMSs for “having the best tools for a job.” The only justification for such a situation would be, “We already have developers who use this technology.”

Does it mean that we are stuck with our current technology stack forever? We all know that that’s not true, and changes happen. But I don’t know why 🙂

Postgres community, any thoughts?


Filed under Development and testing

#PGSQLPhriday #010 – pgBadger

After missing a couple of #PGSQLPhriday challenges, I am finally back! This month’s host is Alicja Kucharczyk, and she chose the topic I could not miss – pgBadger! Thank you so much for this topic, Alicja! I can’t even describe how much I love pgBadger! It’s my favorite and often the only tool that helps me figure out “why everything is slow.” The best tool I have ever worked with!

I have used it at many companies to varying extents. My ideal and most desirable setup is to have hourly and daily pgBadger reports for all production databases so that I can always go back and check the dynamics of execution for any query I am interested in. However, the features most important for analyzing application performance are in the “Top” menu. Specifically, I am talking about “most frequent queries” and “most time-consuming queries.”

We all know that most of the time, when end users and application developers complain that “the database is slow,” the problem is not slow query execution but suboptimal database access patterns. That’s a polite way to say that when app developers use an ORM, it often generates tens, hundreds, if not thousands of database calls to render a single screen. pgBadger allows us to see these multiple database calls, and often we can prove that the queries which take up the most time are not “the most long-running queries” but queries that execute very fast, just “too often.”
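This pattern can be sketched with hypothetical `orders`/`customers` tables (illustrative only, not from any real system):

```sql
-- What an ORM frequently generates: one fast lookup per row.
-- pgBadger's "most frequent queries" report surfaces this immediately.
SELECT * FROM customers WHERE customer_id = $1;  -- repeated once per order on the screen

-- The set-based alternative: a single call for the whole screen.
SELECT o.order_id, c.customer_name
  FROM orders o
  JOIN customers c USING (customer_id)
 WHERE o.order_date = current_date;
```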

Consequently, when/if we can convince application developers to make changes, we can monitor whether the number of calls has decreased.

The biggest challenge I had with pgBadger (and I had it multiple times) was to convince business users/DBAs/managers that it’s OK to turn on full logging (log_min_duration_statement = 0) and that this won’t make “everything slow.” The only real concerns are the space allocated for the logs and their prompt removal.
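For reference, full logging in a pgBadger-friendly format boils down to a few postgresql.conf settings; the log_line_prefix below is one common, parser-friendly choice, not the only valid one:

```ini
# postgresql.conf: log every statement together with its duration
log_min_duration_statement = 0
# a prefix pgBadger can parse (timestamp, pid, user, database, application, host)
log_line_prefix = '%t [%p]: user=%u,db=%d,app=%a,client=%h '
lc_messages = 'C'   # keep log messages in English for the parser
```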

My “wishlist” for pgBadger consists of just one item: an option to keep track of repeating sequences of SQL statements (that’s another way to track what in the world the application is doing!)


Filed under Data management, Development and testing, publications and discussions

What does “a busy system” mean?

A couple of weeks ago, I was fixing problems with a system that had “too many connections.” You know how these systems behave: applications initiate a large number of connections, most of which are idle, but at any given moment, there are a couple of dozen active connections. Sometimes, the Postgres server reaches the maximum number of connections, and the DBA on call has to kill some idle sessions and then plan a restart to increase max_connections to yet another ridiculously large number. Sometimes, the system gets an “out of memory” OS error because each of these idle connections keeps the work memory that was initially allocated to it. Then we start hearing conversations about moving this “very busy system” to a “bigger server,” and also, “Why don’t we install pgbouncer?”
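A quick way to see how “busy” such a system really is, before reaching for a bigger server, is one query against pg_stat_activity (the backend_type filter assumes PostgreSQL 10 or later):

```sql
-- How many of those connections actually do anything?
SELECT state, count(*)
  FROM pg_stat_activity
 WHERE backend_type = 'client backend'
 GROUP BY state
 ORDER BY count(*) DESC;
```

On systems like the one described above, the 'idle' row dwarfs the 'active' one.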

When you read about PostgreSQL best practices and generic system architecture recommendations, you inevitably come across something like “On a busy system, you can increase this parameter to X” or “On a busy system, you may need to do Y.” One of the first questions on every customer questionnaire I know of is “What are the expected average and maximum IOPS?” Then, recommendations are given and resources are planned based on the answer.

I believe this planning starts in the wrong place. The first question we should ask is, “What makes your system busy?” 

Don’t get me wrong; I do not imply that there is no such thing as a “high I/O system.” Still, the first question we should ask is: what is the reason for the expected high I/O? Does the system serve millions of customers at any given moment? How many SQL statements are executed, on average, between the moment the user presses “Enter” and the moment the results appear on the screen?

If applications are using connection pools, they are already consolidating individual customer requests into a much smaller number of sessions. Will one more level of abstraction improve the system throughput? In many cases, it will only make things worse.

In my case, I talked to application developers showing them the percentage of idle sessions and estimated resource consumption. On the first take, they reduced the average number of open connections from 2,700 to a little over 1,000. I kept insisting on more research, and after one more week of digging into the application code, the average number of connections was reduced to sixty! 

There are many reasons for the system being busy. Sometimes it indeed provides the geodata to millions of drivers simultaneously, but in other cases, it may be just connection leakage!

As I always say, tuning system parameters may improve performance up to ten times, tuning individual queries may make everything run tens of times faster, and optimizing the system architecture can make things run hundreds, up to a thousand times faster!


Filed under Development and testing, Systems

Adding partitions to bitemporal model

Until recently, I never tried to partition bitemporal tables, both because I could not find a good way of partitioning them and because my projects never grew big enough for partitioning to become a necessity. However, the need became apparent two months after I started at DRW!

You can see the results in the last commit to pg_bitemporal. The changes were committed about a month ago, but I didn’t announce them, and that’s what I am finally doing :).

There is no change to ll_create_bitemporal_table, so no changes are required to the existing code. I added two more functions: ll_create_bitemporal_table_partitioned and ll_create_bitemporal_partition.

This first version of bitemporal table partitioning is very restrictive. The only partitioning currently supported is by range on the business key. The primary reason is the need to keep bitemporal operations performant, which means that we can’t store adjacent records with the same business key in different partitions. You can think of it this way: all the records related to one object are, in some sense, “one object” and thereby can’t be separated. I believe it should be possible to expand the partitioning by adding more columns to the business key (the same way as for regular tables, where the partitioning key must be a part of the primary key).

I think I will implement it in the next version, but for now, ll_create_bitemporal_table_partitioned will create a table partitioned by range on the business key, and ll_create_bitemporal_partition will create a partition for a specified range. 
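In plain PostgreSQL terms, the layout these two functions produce corresponds roughly to ordinary declarative range partitioning on the business key; here is a sketch without the pg_bitemporal wrappers (table and column names are made up, and the temporal columns are omitted):

```sql
-- Core-PostgreSQL illustration of range partitioning on a business key.
CREATE TABLE deal (
    deal_no  bigint NOT NULL,
    payload  text
) PARTITION BY RANGE (deal_no);

-- One partition per business-key range, so all records for one
-- object always land in the same partition.
CREATE TABLE deal_0_1m  PARTITION OF deal FOR VALUES FROM (0) TO (1000000);
CREATE TABLE deal_1m_2m PARTITION OF deal FOR VALUES FROM (1000000) TO (2000000);
```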

I think it is a good start because it dramatically increases the scope of potential usage of bitemporal tables – check it out!


Filed under Development and testing

One of many dumb things I did recently…

I know that many people are reluctant to admit they made stupid mistakes (like accidentally dropping a table in production). Some of these mistakes are very visible to the rest of the world, like the infamous Amazon S3 rm *, but raise your hand if you have never done anything like this! That’s why I always try to share my “oops” moments, in the hope that it will make many people feel better.

Now, here is my recent story.

I spent two workdays trying to figure out why my stored procedure failed on commit. It was failing with the error “can’t commit while a subtransaction is running.” I am looking at the code, and there is no subtransaction! And I was trying to figure it out for two days! The use case was a classic one for “why we need to commit inside a stored procedure”: we have a loop over a hundred million rows, and we want to commit, say, every 100K or 0.5M rows.

I read and re-read all related PostgreSQL documentation, all mailing lists, and all blog posts I could find, and I still could not figure out how my example differs from other people’s examples which work.

Out of total desperation, I started to remove parts of the code, hoping that, at some point, I would remove something which prevented it from working.

Guess what? I am down to i:=i+1, and it still returns the same error!

And then, finally, I realized what the problem was! I had exception handling!!! Which implicitly creates a savepoint!!! I theoretically knew that, but for some reason I didn’t associate it with a subtransaction.

Now I knew what was wrong, and I knew that the solution was to create a block inside the loop and put the exception handling there. Great.
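For the record, the working shape looks like this (table and column names are made up; the point is only where the exception block sits relative to COMMIT):

```sql
CREATE OR REPLACE PROCEDURE batch_load()
LANGUAGE plpgsql
AS $$
DECLARE
    i bigint := 0;
    r record;
BEGIN
    FOR r IN SELECT id FROM staging LOOP
        BEGIN  -- inner block: the implicit savepoint starts here...
            UPDATE target SET processed = true WHERE id = r.id;
        EXCEPTION WHEN OTHERS THEN
            RAISE NOTICE 'skipping id %', r.id;
        END;   -- ...and ends here
        i := i + 1;
        IF i % 100000 = 0 THEN
            COMMIT;  -- legal: no subtransaction is open at this point
        END IF;
    END LOOP;
    COMMIT;
END;
$$;
```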

I do this, then sit waiting for commit messages, and nothing happens. I know that the first commit message should appear within 20 minutes, and I have already been waiting for two hours. Since this is happening not in production but in my sandbox, I let it run… until it hits “out of disk space.”

Then I realized: when I was putting all of the meaningful pieces back into my code, I had accidentally deleted i:=i+1.

😂 😂 😂


Filed under Development and testing, SQL

NORM_GEN Q&A

My second talk at PG Conf EU was a joint talk, so I only briefly mentioned my work on the NORM project. However, since many conference attendees heard about it for the first time, I feel that I need to fill them in (even if I sound like a broken record to many others :))

So here comes a second summary.

  1. What problem are you trying to solve? I am trying to solve a whole hierarchy of problems. Application developers tend to use ORMs instead of writing embedded SQL statements because of insufficient knowledge of SQL and the pressure to deliver an MVP. ORMs generate a large number of “small queries,” most often selecting multiple records by their primary key values. These queries block the optimization capabilities of PostgreSQL (or any other database, for that matter), and the application response time grows linearly even though each individual query performs very well. We are giving application developers a tool to interact with databases in terms of complex objects, thus drastically reducing the I/O overhead.
  2. Who is driving the database design when using the NORM methodology? Database design is driven by database developers. We do not automatically generate tables, indexes, or other storage structures. A transport object defined by a JSON schema describes a hierarchy used in a microservice or a controller. There can be multiple hierarchies needed by different applications or different parts of the same application, and we can support all of these hierarchies using the same database tables. The only database objects we generate are user-defined types and functions.
  3. Does it mean we need a separate JSON schema for each application controller? Most likely, yes. The worst thing application developers can do with NORM is to reuse several existing mappings to create a new one.
  4. How is NORM different from ORM? Isn’t it just another ORM? NORM is different because it allows applications to operate on data sets rather than on individual records of simple structure. Without exposing it to an application developer, NORM opens opportunities to utilize the power of relational search.
  5. Is it necessary to use functions? No, if application developers are capable and willing to construct the necessary SQL statements, functions will not be needed.
  6. Why do application developers often have difficulties with SQL? Because SQL is a declarative language and operates on sets, while OO languages are imperative and operate on individual objects.
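That last difference can be shown in two lines (hypothetical `account` table):

```sql
-- Imperative habit: one object at a time, driven from application code.
UPDATE account SET status = 'active' WHERE account_id = 42;  -- called in a loop

-- Declarative SQL: describe the whole set once.
UPDATE account
   SET status = 'active'
 WHERE last_login >= current_date - 30;
```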

Same as with the previous post, please leave a comment if I missed any questions (or if you have more :))


Filed under Development and testing, research

Bitemporal Q&A

After my bitemporal presentation, people asked lots of questions, and more questions followed later during the conference and also on Twitter and LinkedIn.

In this blog, I will try to summarize all my answers and give more details on the project development.

  1. What is my take on DDL changes in the bitemporal framework? I do not have a good answer for that. When I used bitemporality in the production environment and had to change a table structure, I would either add a column with NULL values for all the previous records or the opposite – set a default value to null for the new records. I believe that there is no fundamental difference between “not having” a column and “setting the value to NULL.” If you disagree, I am happy to discuss it!
  2. In order for bitemporal operations to look “less ugly,” can we create updatable views and hide bitemporal operations in INSTEAD triggers? Not really, because in this case, you won’t be able to pass effective and asserted ranges, and also, you can’t distinguish between update and correction.
  3. Can I use pg_bitemporal? Is it “in production?” I maintain my fork (hettie-d/pg_bitemporal), which means that I review all the bugs and do my best to fix them in a couple of days or suggest a workaround (or explain why what you are trying to do is not going to work). I used this framework for five years at Braviant Holdings, and to the best of my knowledge, the company is still using it a year and a half after I left with no major issues. One more startup has been using it for at least six months, and they not only report the bugs to me but also offer improvements.
  4. Did you think about making it an extension? I did, and there are two reasons I never got to it. First, I want people to be able to try it in different environments, and if a company uses a cloud-managed service, they can’t install unapproved extensions. Second, I have neither the skills to package it nicely nor the time to learn them. I will be most grateful for any assistance, but for the reason mentioned before, I would be even happier to see it as a part of the Postgres core, hopefully rewritten better and more efficiently.
  5. How do you support foreign keys in the bitemporal model? I still have to formalize the creation of bitemporal referential integrity constraints, but the idea is very simple. You can create triggers on a regular insert that are similar to the internal Postgres triggers used to support regular referential integrity. Here is why you need to check the parent values on insert only: in the bitemporal model, the tuples are never updated except for changes in the asserted range (check the diagrams from my presentations). With this, the only thing we need to check is that, when we insert a physical record, the parent key is effective and asserted from the effective start and asserted start to infinity. One more trigger should prevent the parent key from being deleted if there are dependent child records.
  6. Can I use one-dimensional time? Yes, but in that case, you lose the ability to correct (or rather, to distinguish between a correction and an update) and to make future assertions.
  7. What if you need to actually delete something due to legal requirements? You will need to work with your customer on a case-by-case basis, explaining the consequences, but at the end of the day, you need to do what the law requires.
  8. Users have to add time constraints to all queries; is there any way to help them? From my experience, as soon as users understand what they are getting from the bitemporal framework, they learn to add the “current” time pretty fast and do not get frustrated. Also, we provided them with many “canned” parameterized reports, thus “hiding” the time-dimension filters.

Please leave a comment if I didn’t answer a question you have; I will be happy to answer!


Filed under Development and testing, research

NORM_GEN is Finally Ready!

Hi friends, colleagues, and followers! Those who watch my GitHub repo NORM might have noticed the massive changes pushed last week.

As of today, I am happy to report that all the parts and pieces are in place, along with the demo/example of usage, so NORM_GEN is officially completed and ready for beta-testing!

To download and start using it, go to the NORM_GEN directory of the NORM project. 

What is NORM_GEN?

NORM_GEN is a set of packages that can be used to automate the process of building NORM functions.

To automate this process, we need to present the contract more formally. We represent the contract as a JSON schema, then parse it and store the results in meta tables, which are later used to build the types and functions.

Quick Start

To use NORM_GEN, run the file _load_all_norm_gen.sql from this directory on your local Postgres database. It will create a NORM_GEN schema with metadata tables and deploy the following packages:

process_schema

build_conditions

build_return_type

build_select

build_to_db

generate_select_by_ids

generate_search_generic

generate_to_db_function

In addition, it will create a function ts_all which runs all functions from the process_schema package and populates all metadata.

Demo

For a demo, we use a JSON schema that describes a user account – see user_account.json

This JSON schema describes the mapping of the User account transport object to the database. The database schema can be found in ../sql/create_tables.sql

To see how NORM_GEN works, you can follow the steps listed in ts_all_call.sql

More examples of JSON schemas can be found in flight-booking.json and flight-boarding.json

Enjoy! And please report all issues!


Filed under Data management, Development and testing

NORM News

For those who follow the NORM project, and specifically our most recent addition, NORM_GEN, I have some exciting news. As we continue working on automating the creation of the functions and on making them more usable for application developers, we have made two critical improvements:

First, the build_conditions function now accepts the MongoDB syntax.
For example, the call:

select norm_gen.build_conditions(
'{
"User account":{
"phone_type":"cell",
"email_priority":"primary",
"account":{"last_name":"johns",
"emails":{"email_address":{"$like":"%gmail%"}},
"dob":{"$gt":"1901-01-01"},
"phones":{"phone_number":{"$like":"312%"}}
}
}}'
::json);

will generate the following conditions:

account_id IN (
select account_id from norm.phone where
phone_id IN (
select phone_type_id from norm.phone_type where
phone_type = ('cell'::text) )
AND phone LIKE ('312%'::text) )
AND account_id IN (
select account_id from norm.email where
email_id IN (
select email_priority_id from norm.email_priority where
email_priority = ('primary'::text) )
AND email LIKE ('%gmail%'::text) )
AND last_name = ('johns'::text) AND dob > ('1901-01-01'::date)

Second, for any specific hierarchy, we can generate its own search function using norm_gen.generate_search_generic_function.

For example, generic search for account will look like this:

create or replace function norm.account_search_generic(p_search_json json
) returns norm.account_record[]
language 'plpgsql'
as $BODY$
declare
v_result norm.account_record[];
v_sql text;
begin
v_sql:=norm_gen.nested_root('User account')||
$$ where $$||norm_gen.build_conditions(('{
"User account":'||p_search_json::text||'}')
::json);

execute v_sql into v_result;
return (v_result);
end;
$BODY$;

Note that calling norm_gen.nested_root allows us to pick up a new type definition immediately after it is changed.

You do not need to write this search function yourself; everything is done automatically. Now you can call the function:

select norm.account_search_generic($${
"phone_type":"cell",
"email_priority":"primary",
"account":{"last_name":"johns",
"emails":{"email_address":{"$like":"%gmail%"}},
"dob":{"$gt":"1901-01-01"},
"phones":{"phone_number":{"$like":"312%"}}
}
}$$::json)

And it will return exactly the result you expect!

{"account_id":1,
"username":"aliceacct1",
"last_name":"johns",
"first_name":"alice",
"dob":"1996-04-01",
"emails": [{"email":"[email protected]","email_id":1,"email_priority":"primary","email_priority_id":1},{"email":"[email protected]","email_id":2,"email_priority":"primary","email_priority_id":1}],
"phones":[{"phone_id":1,"phone_type":"cell","phone":"2021234567","phone_type_id":2},{"phone_id":2,"phone_type":"cell","phone":"3121233344","phone_type_id":2}]}

And no application developer needs to know that there is pure PostgreSQL inside!

Dear application developers, software engineers, and anybody else who might be interested – would you try it out :)?


Filed under Development and testing, SQL

Believing in Magic of Parameters…

It continues to surprise me how much people believe that there is a magical combination of PostgreSQL configuration parameters that can suddenly make everything fly. I’d say, if such a magical combination existed, why wouldn’t we make it the default, so that there would be no need to tune anything, ever! Still, we modify parameters all the time. Another surprising fact is that people are more willing to modify system parameters than to tune a single SQL statement.

I do not understand that logic. When you change parameters, you change them for the whole system, so every SQL statement is affected. And although we know how parameter tuning should affect database performance, we do not verify how each SQL statement will perform after the change.

In contrast, when you tune one specific SQL statement, you modify just that single statement, and the performance of other statements remains unchanged. From my perspective, this means the results are more predictable and the change is easier to implement. I am not sure why such suggestions are usually met with skepticism.

Recently, I suggested that a customer rewrite one SQL statement in their system performance test. This statement was executed multiple times during the test, and I was pretty confident that a minor rewrite would drastically improve performance.
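I cannot share the customer’s statement, but the flavor of such a “minor rewrite” can be illustrated with hypothetical tables: a scalar subquery evaluated once per row replaced with a single join.

```sql
-- Before: the subquery is re-evaluated for every row of orders.
SELECT o.order_id,
       (SELECT c.customer_name
          FROM customers c
         WHERE c.customer_id = o.customer_id) AS customer_name
  FROM orders o;

-- After: the same result as one join, which the planner can optimize freely.
SELECT o.order_id, c.customer_name
  FROM orders o
  JOIN customers c USING (customer_id);
```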

The results exceeded my expectations: this minor rewrite cut the total execution time of the test from 46 hours to 23 hours, twice as fast! Without any configuration parameter changes 🙂


Filed under Development and testing, SQL