关系型数据库管理系统（Relational Database Management System，RDBMS）
- 候选键 ：表中可以唯一确定一个元组的某个属性（或者属性组）叫候选键
- 主键 : 从许多候选键(可包含多个属性)中挑一个用来维护记录的完整性就叫主键。
- 外键 ：一个属性（或属性组），它不是这个表的主键，但是它别的表的主键，它就是这个表的外键。
- 主属性 ：一个属性只要在任何一个候选键中出现过，这个属性就是主属性。
- 非主属性 ：与上面相反，没有在任何候选键中出现过，这个属性就是非主属性。
A data model is a description of the data in a database. In addition, the data model describes the relationship among data, and any constraints that have to be defined on the data. Data models can broadly be classified into the following categories.
- Object-based logical model: Focuses on desctibing the data, the relationship among the data, and any constraints defined.
- Record-based logical model: Focuses on specifying the logical structuring of the database.
ER模型是基于对象的逻辑模型(Object-Based Logical Model)
The ER model views the real world as a collection of objects or entities and the relationship among them.
Entities:a thing, whick can be easily identitfied.
Attributes: an attribute is a property of given entity.
Relationships: an association among entities.
A relationship type is an association of entity types. A relationship instance is an association of entity instances.
A relationship can associate an entity with itself.
- One-to-Many(or Many-to-One)
A subtype is a subset of another entity.
A subentity or subtype is always dependent on the supertype for its existence.The attributes of a supertype apply to all of its subtypes. The converse is not true.
The supertype is connected to the relationship with a line containing a crossbar, as indicated in the preceding diagram. The attributes, which describe supertype will also belong to all subtypes. The subtype is described by the attributes that are unique to it.
In this model, data is represented in the form of a tree, and relationships between the data are represented by links.
In this model, the data and relationships among them are represented in the form of records and links. It is similar to a hierarchical model. However, records in a databse are represented by graphically.
In this model, the table in a database has a fixed length record with fixed number of attributes or fields. Of these three models, the relational model is the most popular.
An RDBMS can be defined as a database management system where all the data visible to the user is organized strictly as tables of data values, and all the database operations work on these tables.
The basic principle in a relational database is a tabular arrangement of data values. Each table in a databse has a unique table name that identifies its contents.
The row(or record) in the table is called a tuple, and the column(or field) is called an attribute. Every column in a table must have a unique name. The number of tuples is called the cardinality of the table, and the number of attributes is called the degree of the table.
One important property of a relation is that its rows are unordered. Every table must have some column or conbination of columns that uniquely identifies each row in the table.
No two rows of table contain the same values in that column or column conbination. This column is called the primary key.
A table where every row is different from all other rows is called relation.
Each attribute of a table has an associated domain(域). A domain is a collection of values from which one or more attribute(columns) draw their actual values. The domain lists the possible values for an attribute. (域就是所有取值可能)
The restrict operator extracts specified tuple or rowss from a given relation besed on a condition.
The project operator extracts specified attributes or columns from a given relation.
The product operator joins two relations such that every tuple of the first relation is matched with every tuple of second relation. To be product compatible, the two tables must have common attributes.
The union operator builds a relation from tuples appearing in erither or both of the specified relations where as intersect operator builds a relation consisting of tuples that appear in both the relations.
The intersect operator builds a relation consisting of tuples that appear in both the relaiton.
The difference operator builds a relation of tuples appearing in the first but not in the second of the two specified relation.
The join operator builds a relation from two specified relations. This relation consists of all the possible combinations of tuples, one from each relation that satisfies the specified condition. The join operation requires a common attribute.
The divide operator takes two relations and builds another relation consisting of values of an attribute of one relation that match all the values in the other relation. The divide operation is the opposite of the product operation.
- Regular entities
- Weak entities
- Subtypes and supertypes
A weak entities is an entity whose existence depends on some other entity. The weak entities have the composite key, part of which consists of the primary key of the entity they depend on.
- It identifies its owner entity.
- It refers to another entity.
- It simplifies the description of an entity.
Any attribute(or set of attributes) that uniquely identifies a row in table is a candidate for the primary key(主键). Such an attribute is called a candidate key(候选键). One of the candidate keys is chosen to be the primary key, based on the familiarity and greater usage. Any attribute that is a candidate for the primary key but is not the primary key is called the alternate key(替代键). When the key that uniquely identifies the row of the table is made up of more than one attribute, it is called a composite key(组合键).
Some attributes may acquire further attributes to qualify themselves during databses design and become entities.
Specialization(特化) is the result of taking a subset of a higher-level entity set to form a lower-level entity set.
Generalization(概化) is the result of taking the union of two or more lower-level entity set to produce a higher-level entity set. It is the opposite of specialization.
Normalization is a method of breaking down complex table structures into simple table structures by using certaion rules. Using this mathod, you can reduce redundancy in a table, and eliminate the problems of inconsistency and disk space usage. You can also ensure that there is no loss of information.
Normalization has the following benefits
- It helps in maintaining data integrity(完整).
- It helps in simplifying the structure of tables, therefore, making a database more compact(简洁).
- It helps in reducing the null values, which reduces the complexity of data operation.
Achieve a good database design
- Each table should have an identifier
- Each table should store data for a single type of entity.
- Columns that accept NULLs should be avoided.
- The reprtition of values or columns should be avoided.
A table is said to be 1NF when each cell of the table contains precisely one value.
- Place the related data values in a table. Further, define similar data values with the column name.
- There should be no repeating group in the table.
- Every table must have a unique primary key.
部分依赖(partial dependency) 不完全依赖于键，如有组合键时只依赖于组合键中的某一个键
依赖传递(transitive dependency) x->y->z
A table is said to be in 2NF when:
- It is in 1NF, and
- No partial dependency exists between non-key attributes and key attributes.
- 定义： 符合1NF的前提下, 当表存在组合主键的时候, 每一个非键属性都必须完全依赖于组合主键中的所有主键, 而不能只依赖部分主键
- Find and remove attributes that are functionally dependent on only a part of the key and not on the whole key. Place them in a different table.
- Group the remaining attributes.
A relation is said to be in 3NF if and only if:
- It is in 2NF, and
- No transitive(indirect) dependency exists between non-key attributes and key attributes.
- Find and remove non-key attributes that are functionally dependent on attributes that are not the primary key. Place them in a different table.
- Group the remaining attributes.
The original definition of 3NF was not sufficient in some situations.
- That had multiple candidate keys.
- Where the multiple candidate keys were composite.
- Where the multiple candidate keys overlapped(had at least one attribute in common).
A relation is in BCNF if and only if every determinant is candidate key.
- Find and remove the overlapping candidate keys. Place the part of the candidate key an th attribute it is functionally dependent on, in a different table.
- Group the remaining items into a table.
The intentional introduction of redundancy in a table in order to improve performance is called denormalization.