DATA CAMP
DATA SCIENTIST WITH PYTHON
DATABASES
INTRODUCTION TO SQL
Relational databases
A relational database defines relationships between tables of data inside the database.
Possibly the biggest advantage of a database is that many users can write queries to gather insights from the data at
the same time.
More storage than spread sheets.
Can be secured with encryption.
SQL. SQL! SQL, or S-Q-L, is short for Structured Query Language.
Tables
Main building block of databases.
Tables are organized into rows and columns.
In the world of databases, rows are often referred to as records and columns as fields.
Table names should be (BEST PRACTICE):
o lowercase
o should not include spaces - we use underscores in place of spaces
o refer to a collective group / be plural
Record. Row that holds data.
Field. Column in a table.
Field name (BEST PRACTICE):
o Be lower case
o Have no spaces
o Be singular
o Be different from other field names
o Be different from the table name
A unique identifier, sometimes called a "key”
o Is just what it sounds like:
a unique value which identifies a record so that it can be distinguished from other records in the same
table.
They are unique and often numbers
Table topics should remain separated in different tables.
SQL data types
Different types of data are stored differently and take up different amounts of storage space.
Some operations only apply to certain data types.
o Strings
SQL VARCHAR data type is more flexible and can store small or larger data.
o Integers
o Floats
Schemas
o Schemas are often referred to as "blueprints" of databases.
o A schema shows a database's design, such as what tables are included in the database and any relationships
between its tables.
Database storage
o Finally, let's discuss storage. The information we find in a database table is physically stored on the hard disk
of a server.
Servers
o Centralized computers that perform services via requests made over a network. In our case, the service
performed is data access, but servers are also used to access websites or files stored on the server.
o Any computer can be a server if it is set up to provide a service, even a laptop!
Introducing to queries
Keywords are reserved words used to indicate what operation we'd like our code to perform.
BEST PRACTICES
o To end the query with a semicolon to indicate that the query is complete.
o We also capitalize keywords while keeping table and field names all lowercase.
o We can use * to select all field names.
Writing queries
Alias (Aliasing). Rename columns in our result set.
DISTINCT. Return unique combinations of a field or multiple fields.
Views.
o A view is a virtual table that is the result of a saved SQL “SELECT” statement.
o A benefit of views is that when accessed, views automatically update in response to updates in
the underlying data.
CREATE VIEW table AS SELECT field, field1, field2 FROM orig_table;
SQL Flavors (Versions)
For example, MS SQL, Oracle DB.
Can be both free and paid.
All used with relational databases.
Vast majority of keywords are the same.
All must follow universal standards set by the
o International Organization for Standards
o American National Standards Institute
Only the additions on the top of these standards make flavors different.
PostrgreSQL
o The POSTGRES project, led by Professor Michael Stonebraker, was sponsored by the Defense
Advanced Research Projects Agency (DARPA)
o Free and open-source relational database system
o Created at the University of California, Berkley
o Refers to both the PostgreSQL database system and its associated SQL flavor.
SQL Server
o Has free and paid versions
o Created by Microsoft
o T-SQL (Transact-SQL) is Microsoft’s flavor, used with SQL Server databases.
o T-SQL vs SQL. T-SQL is a variant of SQL, with some additions like Stored Procedures and Triggers.
Comparing PostgreSQL and SQL Server.
o Limiting results:
PostgreSQL: LIMIT 2;
SQL Server: top(2)