Introduction and Relational Databases
I was learning SQL through SQLZOO, and at first, it looks easy. The deeper I got into it, and I realized that it can become all gibberish. I figured it might have to do with me not understanding the database entirely. I decided to learn database in order to be able use SQL proficiently. I will be using this blog to act as a note on what I learned about databases. I will be learning about database via Stanford University online course.
Keywords to learn
Database Management Systems (DBMS)
It provides a means of handling large amount of data primarily
concurrency control
Used to address conflicts with the simultaneous accessing or altering of data that can occur with a multi-user system
physical data independence
Data stored in database and directly unaffected or unadulterated by external users/programs, data remains the same
attributes
Another word for columns in database, each has a type or a domain like int, float, jpeg file
tuples
Another word for rows in database, hold values
Introduction
Why database?
- can hold number of terabytes data daily!
- outlives any programs that execute on that data
- preserves data, keeping them consistent regardless of what happens, such as a malfunction with: hardware, software, power, and users
- allows concurrency control where multi-users can access to and use data without overwriting
- designed to work easily with large amounts of data
Physical data independence is what makes database useful. Users and programs can do whatever with the data, but data in the database will remains the same, unmoved. Databases are usually queried by query languages that are relatively compact to describe the algorithm to get the data out. Query language is very declarative.
Database systems are often used in conjunction with middle-ware. Middle-ware might be application servers, web servers which help application to interact with database system.
There are four key concepts to start with:
data model
the description of how the data is structured, often in relational dot. Data would be in a set of records. Currently, popular format is XML document to store data. Another is graph data model.
schema versus data
Schema are types while data are variables. Database is sometime adhered to schema, which tells the structure of the database, to find where specific data are. Usually, schema is set up at the beginning.
data definition language (DDL)
DDL is used to set up schema or structure for a particular database.
data manipulation or query language (DML)
DML is used to querying or modifying the data.
Four key people involved with database:
DBMS implementer
Sets up system or database.
Database designer
Establish schema for databases.
Database application developers
Builds an application that runs on the database, interfacing between eventual user and the data itself.
Database administrator
Loads data and keeps it running smoothly.
Database systems have a number of tuning parameters associated. Can make a significant difference in the all important performances.
The Relational Model
- Database is a set of relations or tables
- Each relation has a set of named attributes or columns
- Each tuple or row has a value for each attribute
Let’s take a look at the relation Student below:
Each attribute has a type like int, or float, and or jpeg file. Type is sometime referred as a domain.
We can see schema in relation Student, mostly starting with attributes. Attributes are filled with instances. Instances are actual content at given point in time, and these do change over the time.
Schema decide to have data of ID, Name, GPA, and Photo. These are instances help identifying a “student” in table Student. Attributes is encouraged to be a detectable unique character or key. Key is attribute whose value is unique in each tuple.
Look at different relation College:
There are probably SEVERAL colleges with SAME NAME “Washington”, so it CAN’T be really a key BUT if state is added next to it. Two combined values of attributes together CAN become a key. Key is effective when used as a target for query.
Creating table in SQL
1
create Table Student(ID, Name, GPA, Photo)
Another example:
1
create Table College(name string, state char(2), enrollment integer)
Querying Relational Databases
The basic steps in creating and using relational database goes:
- design the schema
- create the schema using DDL (starting with attributes)
- load up the database with initial data (usually from external source)
People have tendency of drawing database as a barrel or a massive disk.
After that, the next step would be:
- query and modify the data
- might need to update or insert new data later on
A person who wants to use the database would ask queries and database will provide answers.
Queries would be such:
- all students with GPA > 3.7 applying to Stanford and MIT only
- all engineering departments in CA with < 500 applicants
- college with highest average accept rate over last 5 years
When a query 1 would goes over the relations and “create a new relation”, it is called closed. If there’s a query 2 coming up and covers the “new relation”, and existing relations then it is compositionality.
Query languages
Relational Algebra
It is a formal language, a very theoretically well-grounded.
SQL
An actual language, implemented language. It does have a foundation relational algebra. Semantics are defined.
Comments powered by Disqus.