Transcript

Expand

Model behavior: An Introduction to data models

Data models:

  • Organizes data elements.

  • Standardizes relation of data elements to each other.

  • Standardizes relation of data elements to properties of real-world entities.

Database models:

  • A type of data model.

  • Determines a database’s logical structure.

  • Determines how data can be stored, organized, and manipulated.

A normalized relational database and normalized star schema are two of the most common database models, each of which as its advantages.

Normalized relational database

  • Uses tables to make databases independent of software applications.

Database normalization:

  • Process of structuring relational database.

  • Reduces data redundancy and improves data integrity.

  • Organizes columns or attributes in tables.

  • Enforces various data integrity constraints.

  • Must meet requirements for previous normalization level before next level can be achieved.

Unnormalized data:

  • May or may not have primary key.

  • Table does not have duplicate records.

First normal form (1NF):

  • Requires primary key.

  • Requires all cells in tables to have single values.

  • Move attributes that could have multiple values to separate table.

  • Enforces various data integrity constraints.

  • Create primary key to connect new table to old table.

Second normal form (2NF)

  • 1NF plus removing table’s partial dependencies.

  • All attributes must depend on entire primary key.

Third normal form (3NF):

  • 2NF plus removing all transitive dependencies.

  • Attributes may only depend on primary key.

Star schema:

  • Post-relational data model.

  • More generalized data model than relational data model.

  • Most common data model in data warehouses.

  • Unnormalized data.

  • Consists of at least one fact table that references any number of dimension tables.

Fact table:

  • Records measurements for specific events.

  • Usually consists of numeric values and foreign key.

  • Foreign key links to dimensional table containing descriptive information for numeric values.

  • Records measurements at low level of detail or granularity.

  • Results in accumulation of many records over time.

  • Three types: Transaction, snapshot, and accumulating snapshot fact tables.

Dimension table:

  • Has smaller number of records compared to fact table

  • Number of attributes in their records can be very large.

  • Defines many types of dimensions:
    Time: (most common) Describe time at which events are recorded in fact table.
    Range: Describe range of measurable quantities to simplify reporting.
    Other: Tables for employees, geography, and products.

Comparison:

Advantages of normalized relational database versus star schema:

  • Strictly reinforces data integrity to prevent anomalies like one-off inserts and updates.

  • Greater flexibility in performing analytical tasks if they follow database model’s business logic.

  • More easily supports many-to-many relationships between business entities.

Benefits of star schema versus normalized relational database:

Simpler reporting logic, especially as-of and period-over-period reporting.

Simpler join-logic:

  • Improves performance on read-only operations like reporting and queries.

  • Improves performance of aggregration operations.

  • Efficiently build proprietary cubs for online analytical processing (OLAP) systems.

Model Behavior: An Introduction to Data Models

A data model organizes data elements and standardizes their relation to each other and the properties of real-world entities. A database model is a type of data model that determines a database’s logical structure how data can be stored, organized and manipulated in that database. A normalized relational database and unnormalized star schema are two of the most common database models, each of which has its advantages.

ER/Studio Data Architect enables you to efficiently catalog your current data assets and sources across different platforms and track end-to-end data lineage. Simplify your data architecture with a common language leveraging consistent naming standards and data definitions. Easily specify the sensitive data objects that need heightened protection, to withstand audit scrutiny. Learn more →.

Share This
Contact IDERA: