Data Catalog for Gold Layer

Overview

The Gold Layer is the business-level data representation, structured to support analytical and reporting use cases. It consists of dimension tables and fact tables for specific business metrics.


1. gold.dim_customers

  • Purpose: Stores customer details enriched with demographic and geographic data.
  • Columns:
Column Name Data Type Description
customer_skey INT Surrogate key uniquely identifying each customer record in the dimension table.
customer_id INT Unique numerical identifier assigned to each customer.
customer_key VARCHAR(50) Alphanumeric identifier representing the customer, used for tracking and referencing.
customer_first_name VARCHAR(50) The customer’s first name, as recorded in the system.
customer_last_name VARCHAR(50) The customer’s last name or family name.
customer_birthdate DATE The date of birth of the customer, formatted as YYYY-MM-DD (e.g., 1971-10-06).
customer_gender VARCHAR(50) The gender of the customer (e.g., ‘Male’, ‘Female’, ‘n/a’).
customer_marital_status VARCHAR(50) The marital status of the customer (e.g., ‘Married’, ‘Single’).
customer_country VARCHAR(50) The country of residence for the customer (e.g., ‘Australia’).
create_date DATE The date and time when the customer record was created in the system

2. gold.dim_products

  • Purpose: Provides information about the products and their attributes.
  • Columns:
Column Name Data Type Description
product_skey INT Surrogate key uniquely identifying each product record in the product dimension table.
product_id INT A unique identifier assigned to the product for internal tracking and referencing.
product_key VARCHAR(50) A structured alphanumeric code representing the product, often used for categorization or inventory.
product_name VARCHAR(50) Descriptive name of the product, including key details such as type, color, and size.
product_category_id VARCHAR(50) A unique identifier for the product’s category, linking to its high-level classification.
product_category VARCHAR(50) The broader classification of the product (e.g., Bikes, Components) to group related items.
product_subcategory VARCHAR(50) A more detailed classification of the product within the category, such as product type.
product_maintenance_flag VARCHAR(50) Indicates whether the product requires maintenance (e.g., ‘Yes’, ‘No’).
product_cost INT The cost or base price of the product, measured in monetary units.
product_line VARCHAR(50) The specific product line or series to which the product belongs (e.g., Road, Mountain).
product_start_date DATE The date when the product became available for sale or use, stored in

3. gold.fact_sales

  • Purpose: Stores transactional sales data for analytical purposes.
  • Columns:
Column Name Data Type Description
sales_details_skey CHAR(32) Derived surrogate key using md5 hash function using sales_order_number, product_key, customer_key. 32 characters long.
sales_order_number VARCHAR(50) A unique alphanumeric identifier for each sales order (e.g., ‘SO54496’).
product_key INT Surrogate key linking the order to the product dimension table.
customer_key INT Surrogate key linking the order to the customer dimension table.
sales_order_date DATE The date when the order was placed.
sales_shipping_date DATE The date when the order was shipped to the customer.
sales_due_date DATE The date when the order payment was due.
sales_amount INT The total monetary value of the sale for the line item, in whole currency units (e.g., 25).
sales_quantity INT The number of units of the product ordered for the line item (e.g., 1).
sales_price INT The price per unit of the product for the line item, in whole currency units (e.g., 25).

Want to support my blog?

Buy Me A Coffee