Database Services and MySQL Guide
Database Services
Arif Husen
➢ Types of DBMS
In general, there are two common types of databases:
▪ Non-Relational
▪ Relational
➢ Non-Relational Database Management System (Non-RDBMS)
▪ In a Non-RDBMS, data is stored in key-value pairs. For example:
Customers = [ {id: 1, name: "furqan", age: 40}, {id: 2, name: "aisha", age: 22} ]
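This key-value style can be sketched in a few lines of Python (the `by_id` index is illustrative, not part of any particular database product):

```python
# Minimal key-value style store: each record is a free-form dictionary.
customers = [
    {"id": 1, "name": "furqan", "age": 40},
    {"id": 2, "name": "aisha", "age": 22},
]

# Build a key -> record index for O(1) lookup by id.
by_id = {c["id"]: c for c in customers}

print(by_id[2]["name"])  # aisha
```

Note that, unlike a relational table, nothing forces every record to have the same fields.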
Tables
➢ Types of tables: Kernel tables, Association tables, Characteristic tables
➢ Elements of a table: Columns, Rows, Indexes, Keys
Types of Tables
• Tables that are independent • Tables that represent a relationship • Tables whose purpose is to qualify
entities. Kernel tables often among entities. or describe some other entity.
represent or model things that exist
in the real world. • For example, an order represents • Characteristic only have meaning in
an association between a customer relation to the entity they describe.
• Some example kernel tables are and goods.
customers, vendors, employees, • For example, order-lines might
parts, goods, and equipment. describe orders; without an order,
an order-line is useless.
Elements of Table: Row
➢ A row is a single occurrence of the data contained in a table; each row is treated as a single unit. In the
Customer table image in Tables, there are four rows, and each row contains information about an
individual customer.
Elements of Table: Column
➢ Rows are organized as a set of columns (or fields). All rows in a table comprise the same set of columns.
In the Customer table image in Tables, the columns are Cust Number, Name, and Street.
Elements of Table: Keys
➢ Keys identify a unique row.
▪ A primary key is a column (or group of columns) whose value uniquely identifies each row in a table. Because the key value is always unique, you can use it to detect and prevent duplicate rows.
▪ A good primary key has the following characteristics:
❑ mandatory
❑ unique
❑ stable
❑ short
▪ A foreign key is a column value in one table that is required to match the column value of the primary key in another table.
▪ If the foreign key value is not null, then the primary key value in the referenced table must exist.
▪ It is this relationship of a column in one table to a column in another table that provides the relational database with its ability to join tables.
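Primary and foreign keys can be tried out with SQLite (a sketch; the `customers` and `orders` tables are illustrative, not from this guide):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite does not enforce FKs by default

conn.execute("""CREATE TABLE customers (
    cust_number INTEGER PRIMARY KEY,   -- mandatory, unique, stable, short
    name        TEXT NOT NULL)""")
conn.execute("""CREATE TABLE orders (
    order_id    INTEGER PRIMARY KEY,
    cust_number INTEGER REFERENCES customers(cust_number))""")

conn.execute("INSERT INTO customers VALUES (1, 'furqan')")
conn.execute("INSERT INTO orders VALUES (100, 1)")       # FK matches a PK: accepted

try:
    conn.execute("INSERT INTO orders VALUES (101, 99)")  # no customer 99: rejected
except sqlite3.IntegrityError as e:
    print("rejected:", e)
```

The matching columns are also what make a join possible: `SELECT c.name FROM orders o JOIN customers c ON o.cust_number = c.cust_number`.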
Elements of Table: Indexes
➢ An index in a database operates like the index tab on a file folder. It points out one identifying column, such
as a customer's name, that makes it easier and quicker to find the information you want.
▪ A single column can be used to define a simple index, or a combination of columns to define a
composite or compound index.
▪ To decide which columns to use, you first need to determine how the data in the table is accessed.
▪ If users frequently look up customers by last name, then the last name is a good choice for an index.
▪ It is typical to base indexes on primary keys.
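The last-name example can be demonstrated with SQLite; the `idx_last_name` index name is made up for the example:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (cust_number INTEGER PRIMARY KEY, last_name TEXT)")

# A simple index on one column; list more columns for a composite index.
conn.execute("CREATE INDEX idx_last_name ON customers (last_name)")

# EXPLAIN QUERY PLAN shows whether the lookup uses the index.
plan = conn.execute(
    "EXPLAIN QUERY PLAN SELECT * FROM customers WHERE last_name = ?", ("husen",)
).fetchall()
print(plan)  # the plan should mention idx_last_name
```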
Database Schema
➢ A database schema defines how data is organized within a relational database.
▪ It covers logical constraints such as table names, fields, data types, and the relationships
between these entities.
➢ Schemas commonly use visual representations to communicate the architecture of the database,
becoming the foundation for an organization’s data management discipline.
➢ SQL is generally used with relational databases; there is no comparable standard query language for
non-relational databases.
MySQL
➢ Applications
▪ Huge websites like Facebook, Twitter, Airbnb, Booking.com, Uber, GitHub, YouTube, etc.
▪ Content Management Systems like WordPress, Drupal, Joomla!, Contao, etc.
▪ Used by a very large number of web developers around the world
Common RDBMS
• MySQL,
• Microsoft SQL Server
• Oracle
• Microsoft Access
Installing MySQL Server
➢ Installing the MySQL service
mysql --version
sudo mysql_secure_installation
sudo mysql
GRANT CREATE, ALTER, DROP, INSERT, UPDATE, DELETE, SELECT, REFERENCES, RELOAD
on *.* TO 'username'@'localhost' WITH GRANT OPTION;
FLUSH PRIVILEGES;
▪ It assigns users permissions to work on database server management and maintenance activities.
▪ Here are some of the important tasks that can be performed with mysqladmin:
▪ Provides a defined and improved structure of settings vital for the performance of the MySQL server.
▪ Visualizes the flow by displaying a graphical representation, making it easy for users to read, interpret, and
fine-tune the settings of the MySQL server.
▪ Takes care of security risk management, so you can work with your data safely.
▪ Easily imports and exports data files from the MySQL server, subject to file size limits.
▪ Maintains user accounts and their passwords, and can also lock or unlock users whenever needed.
▪ Provides open-source flexibility and secure transactional support with high scalability and continuous
uptime.
Administering MySQL with mysqladmin
➢ Common mysqladmin commands:
▪ Create a new database: create db_name
▪ Delete a database: drop db_name
▪ Check the status of all MySQL server variables: extended-status
▪ Flush all information in the host cache: flush-hosts
▪ Flush all tables: refresh
▪ Set a new password: password new_password
▪ Stop the server: shutdown
▪ Display the server variables: variables
➢ Examples:
▪ Change the MySQL root password: mysqladmin -u root password 'New_Password'
▪ Change the root password when one is already set: mysqladmin -u root -pOld_Password password 'New_Password'
▪ Check whether your MySQL server is up and running: mysqladmin -u root -pPassword ping
▪ Check MySQL server status and uptime: mysqladmin -u root -pPassword status
▪ Check the version of the MySQL server: mysqladmin -u root -pPassword version
▪ Show the extended status of the MySQL server: mysqladmin -u root -pPassword extended-status
▪ Check MySQL server variables: mysqladmin -u root -pPassword variables
▪ Check the MySQL process list: mysqladmin -u root -pPassword processlist
▪ Kill a MySQL client process: mysqladmin -u root -pPassword kill 195003
phpMyAdmin
▪ Security: Ubuntu has some built-in security features, but they may not be as robust as Debian's.
▪ Stability: Ubuntu is generally stable but may not be as reliable as Debian.
▪ Compatibility: installing and configuring phpMyAdmin on a separate system may require additional steps.
▪ Familiarity: some learning may be required if you are not already familiar with Ubuntu.
Installing phpMyAdmin
➢ Update the APT repositories:
sudo apt update
➢ Install phpMyAdmin and enable the mbstring PHP extension:
sudo apt install phpmyadmin
sudo phpenmod mbstring
➢ Relational databases are not well suited to very large, unstructured, or semi-structured data; NoSQL databases address these cases. The common types are:
• Document databases store data as semi-structured documents, such as JSON or XML, which can be queried using document-oriented query languages.
• Key-value databases store data as key-value pairs and are optimized for simple and fast read/write operations.
• Column-family databases store data as column families, which are sets of columns that are treated as a single entity; they are optimized for fast and efficient querying of large amounts of data.
• Graph databases store data as nodes and edges and are designed to handle complex relationships between data.
Key Characteristics of NoSQL
➢ The key characteristics of NoSQL databases are:
• Flexible schema: they do not have a fixed schema and can accommodate changing data structures without the need for migrations or schema alterations.
• Horizontal scalability: they are designed to scale out by adding more nodes to a database cluster, making them well suited for handling large amounts of data and high levels of traffic.
• Document-based data models: some NoSQL databases, such as MongoDB, store data in a semi-structured format such as JSON or BSON.
• Key-value data models: NoSQL databases such as Redis store data as a collection of key-value pairs.
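The flexible-schema characteristic is easy to see in Python: documents in the same collection can carry different fields (a plain list stands in for a document collection here; this is not a real MongoDB client):

```python
import json

collection = []  # stands in for a document-style collection

# Documents with different shapes coexist without any schema migration.
collection.append({"id": 1, "name": "furqan", "age": 40})
collection.append({"id": 2, "name": "aisha", "email": "aisha@example.com"})

for doc in collection:
    print(json.dumps(doc))
```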
• Sharding for horizontal scaling: vertical scaling is complex, so NoSQL databases scale out by sharding data across nodes; MongoDB and Cassandra are examples of horizontally scaling databases.
• Designed to handle unstructured or semi-structured data, which means they can accommodate dynamic changes to the data model; this makes them suitable for applications that need to handle changing data requirements.
• Auto-replication makes NoSQL databases highly available: in case of any failure, data replicates itself to the previous consistent state.
• They handle huge amounts of data; as the data grows, they scale to handle it efficiently.
• Designed to handle large amounts of data and traffic, which means they can offer improved performance compared to traditional relational databases.
• More cost-effective than traditional relational databases, as they are typically less complex and do not require expensive hardware or software.
• Ideal for agile development.
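Hash-based sharding can be sketched as follows (a simplified illustration; real systems such as MongoDB and Cassandra use more sophisticated routing, e.g. consistent hashing, so that adding a node moves only a fraction of the keys):

```python
# Toy hash-based sharding: route each key to one of N shards.
NUM_SHARDS = 3
shards = [dict() for _ in range(NUM_SHARDS)]

def shard_for(key: str) -> int:
    # Stable hash so the same key always maps to the same shard.
    return sum(key.encode()) % NUM_SHARDS

def put(key, value):
    shards[shard_for(key)][key] = value

def get(key):
    return shards[shard_for(key)].get(key)

put("customer:1", {"name": "furqan"})
put("customer:2", {"name": "aisha"})
print(get("customer:2"))  # {'name': 'aisha'}
```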
Merits/Demerits
➢ They can also be used for other applications, such as content management systems, document
management, and customer relationship management.
➢ NoSQL databases may not be suitable for all applications, as they may not provide the same level of data
consistency and transactional guarantees as traditional relational databases.
➢ It is important to carefully evaluate the specific needs of an application when choosing a database
management system.
Available NoSQL Databases
➢ Elasticsearch is a distributed, free and open search and analytics engine for all types of data,
including textual, numerical, geospatial, structured, and unstructured.
▪ Elasticsearch is built on Apache Lucene and was first released in 2010 by Elasticsearch N.V. (now
known as Elastic).
➢ Elasticsearch is known for its simple REST APIs, distributed nature, speed, and scalability. It is the
central component of the Elastic Stack, a set of free and open tools for data ingestion, enrichment,
storage, analysis, and visualization.
▪ Commonly referred to as the ELK Stack (after Elasticsearch, Logstash, and Kibana), the Elastic Stack
now includes a rich collection of lightweight shipping agents known as Beats for sending data to
Elasticsearch.
Operations of ES
➢ Raw data flows into Elasticsearch from a variety of sources, including logs, system metrics, and web
applications.
▪ Data ingestion is the process by which this raw data is parsed, normalized, and enriched before
it is indexed in Elasticsearch.
➢ Once indexed in Elasticsearch, users can run complex queries against their data and use aggregations to
retrieve complex summaries of their data.
Elasticsearch index
➢ An Elasticsearch index is a collection of documents that are related to each other. Elasticsearch
stores data as JSON documents.
▪ Each document correlates a set of keys (names of fields or properties) with their corresponding values
(strings, numbers, Booleans, dates, arrays of values, geolocations, or other types of data).
➢ Elasticsearch uses a data structure called an inverted index, which is designed to allow very fast
full-text searches. An inverted index lists every unique word that appears in any document and
identifies all of the documents each word occurs in.
▪ An inverted index is a mapping of each specific 'word' (token) to the list of documents (locations)
containing that word, allowing users to easily find documents containing given keywords. Index data is
contained in one or more partitions, also defined as shards. Elasticsearch also automatically distributes
and allocates shards to cluster nodes.
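A toy inverted index in Python makes the idea concrete (simplified: no real tokenization, stemming, or sharding):

```python
from collections import defaultdict

docs = {
    1: "elasticsearch stores json documents",
    2: "an inverted index lists every unique word",
    3: "documents are stored in shards",
}

# Map each token to the set of document ids containing it.
inverted = defaultdict(set)
for doc_id, text in docs.items():
    for token in text.lower().split():
        inverted[token].add(doc_id)

print(sorted(inverted["documents"]))  # [1, 3]
```

A keyword query is then just a lookup in this mapping instead of a scan over every document.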
➢ During the indexing process, Elasticsearch stores documents and builds an inverted index to make
the document data searchable in near real-time. Indexing is initiated with the index API, through
which you can add or update a JSON document in a specific index.
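A call to the index API might look like the following sketch (the `customers` index name and the document fields are made up for illustration):

```http
PUT /customers/_doc/1
{
  "name": "furqan",
  "age": 40
}
```

Elasticsearch stores the JSON document under id 1 in the `customers` index and makes it searchable in near real-time.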
Architecture of Elasticsearch
➢ (Diagram: data ingestion feeds documents into an ES cluster of data nodes; each index is divided into shards, which hold the documents in a document store; visualization tools sit on top of the cluster.)
Basic Terms in ES
➢ Some basic terminology related to ES:
➢ Elasticsearch is fast.
▪ It is built on top of Lucene, which excels at full-text search.
▪ It is a near real-time search platform with very low latency: typically about one second from indexing a document until it becomes searchable.
❑ This is a requirement for time-sensitive use cases such as security analytics and infrastructure monitoring.
➢ Common use cases:
▪ Website search
▪ Enterprise search
▪ Security analytics
▪ Business analytics
ES Node Roles
➢ The main node roles in ES are as follows:
• Master: a node with the master role is eligible to be elected as the master node, which controls the cluster.
• Remote_cluster_client: a node with the remote_cluster_client role is eligible to act as a remote client.
• ml: allows the node to use machine learning features; there must be at least one machine learning node in your cluster.
Elastic Beats
Filebeat
▪ Used to read, preprocess, and ship data from sources that come in the form of log files.
▪ Filebeat further supports a number of other data sources including TCP/UDP, containers, Redis, and Syslog.
▪ A large number of modules ease the collection and parsing of log formats for applications such as Apache, MySQL, and Kafka.
Metricbeat
▪ Collects system metrics, including information about running processes as well as CPU/memory/disk/network utilization numbers.
▪ Can collect data from many different services including Kafka, Palo Alto Networks, Redis, and many more.
Packetbeat
▪ Collects and preprocesses live networking data, enabling application monitoring as well as security and network performance analytics.
▪ Among others, Packetbeat supports the following protocols: DHCP, DNS, HTTP, MongoDB, NFS, and TLS.
Winlogbeat
▪ Captures event logs from Windows operating systems, including application events, hardware events, and security and system events.
▪ The vast information available from the Windows event log is of much interest for many use cases.
Auditbeat
▪ Detects changes to critical files and collects events from the Linux Audit Framework.
▪ Different modules ease its deployment; it is mostly used in security analytics use cases.
Functionbeat
▪ Collects logs and metrics from within a serverless environment such as AWS Lambda.
Logstash
➢ Logstash is a powerful and flexible tool to read, process, and
ship data of any kind.
▪ Logstash provides several capabilities that are not
currently available or too costly to perform with Beats,
such as enriching documents by performing lookups
against external data sources.
➢ However, this functionality and flexibility come at a price: the hardware
requirements of Logstash are significantly higher than those of Beats.
➢ A common architectural pattern is to combine Beats and Logstash: use Beats to collect data and use
Logstash to perform any data processing that Beats are not capable of doing.
Logstash Pipeline
➢ Logstash works by executing event processing pipelines, where each pipeline consists of at
least one of each of the following:
▪ Inputs read from data sources such as files, http, imap, jdbc, kafka, syslog, tcp, and udp.
▪ Filters process and enrich the data in various ways:
❑ Filters can parse CSV, JSON, key/value pairs, delimited unstructured data, and complex unstructured data on the basis of regular expressions (grok filters).
❑ Filters can enrich data by performing DNS lookups, adding geo-information about IP addresses, or performing lookups against a custom dictionary or an Elasticsearch index.
❑ Additional filters allow diverse transformations of the data, for example renaming, removing, or copying data fields and values (mutate filter).
▪ Outputs write the parsed and enriched data to data sinks and are the final stage of the Logstash processing pipeline. While many output plugins are available, here we focus on ingestion into Elasticsearch Service using the Elasticsearch output.
Logstash Example Pipeline
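A minimal pipeline with one input, two filters, and one output might look as follows (a sketch; the log file path, host, and index name are placeholders):

```conf
input {
  file {
    path => "/var/log/example.log"
  }
}

filter {
  # Parse each line as an Apache-style access log entry.
  grok {
    match => { "message" => "%{COMBINEDAPACHELOG}" }
  }
  # Enrich events with geo-information for the client IP.
  geoip {
    source => "clientip"
  }
}

output {
  elasticsearch {
    hosts => ["localhost:9200"]
    index => "weblogs"
  }
}
```

Saved as a `.conf` file, this is run with `logstash -f pipeline.conf`; each stage corresponds to one of the input/filter/output roles described above.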