Key Data Warehouse Terms

For a comparison of data warehousing to databases, data lakes, etc., please see Demystify Data Warehouse Basic Concepts.
Term Definition
AggregationRefers to the summing up of facts in selected dimensions from the original fact table (see “fact table” below).
Artificial intelligence (AI)Refers to the development of computer systems that can perform tasks normally performed by humans (i.e., visual perception, speech recognition, decision making, language translation). See “deep learning” and “machine learning.”
AnalyticsRefers to the process of finding meaningful patterns in data sets.
AttributeRefers to a single field of information in a dimension (see “dimension” below), such as a client ID number.
Big dataRefers to the size and complexity of data sets that can’t be adequately handled by traditional databases.
Business intelligence (BI)Refers to using software and services to turn data into actionable insights to inform an organization’s strategic decisions.
Common table expressionsRefers to the result set of a query that temporarily exists inside a larger query, for the purpose of deconstructing queries into reusable blocks.
Data architectureRefers to the way data gets processed and stored, including the data models (see “data modeling” below).
Data enrichmentRefers to enhancing, refining, or otherwise improving raw data (i.e., misspellings or typographical errors) with precision algorithms.
Data ingestion Refers to the process of collecting and loading data into a database.
Data migration Refers to the process of permanently transferring data from one computer storage system to another.
Data mining Refers to searching for hidden patterns of data, from different perspectives.
Data modelingRefers to the frameworks used within information systems to store and manage data for consistent usage.
Deep learningRefers to a subset of machine learning (see “machine learning” below) that mimics the complexity of human neural networks, to mimic skills that come intuitively to humans.
DeduplicationRefers to the use of machine learning (see “machine learning” below) to eliminate redundant data.
Data pipeline Refers to the process of extracting data from various sources in an automated way.
Data visualizationRefers to graphically presenting information to give readers a deeper, visual understanding.
DimensionRefers to a category of information, like personally identifiable information (PII).
Dimensional modelRefers to the data modeling, or how the data is organized.
Drill across/down/through/upRefers to data analysis using metaphorical directions:
Across refers to dimensions,
Down refers to a child attribute,
Through refers to displaying another aspect of possibly relevant data with a pop-up chart, and
Up refers to a parent attribute.
ETLOr "Extract, Transform, Load" is the process by which data exits and enters the data warehouse.
Fact tableRefers to a table type, typically including two types of columns: fact columns and foreign keys to the dimensions. Facts are the performance measurements from business events, such as sales amount, client enrollments, cost of medical procedures, and so on.
Machine learningIs a subset of artificial intelligence (AI) that improves performance based on iterations of data processing and building systems to capture performance. It’s widely used by large enterprises today. AI is not always machine learning, but machine learning is always AI.
OLAPStands for “online analytical processing,” and is a data cube, referring to a data architecture that is built from tables in a database that has calculations.
Operational data store (ODS) Refers to a source of data that is often used as a temporary staging area (see “staging area” below) to upload into a conventional, modern data warehouse. ODS data is cleaned and validated but often isn’t very historically deep.
Relational databaseRefers to a set of data tables with columns and rows that support full SQL (see “SQL” below.)
SchemaRefers to a collection of database objects like tables, views, indexes, and synonyms that form the dimensional model (see “dimensional model”) of a data warehouse.
Spread-martRefers to a massive workbook filled with dozens or hundreds of spreadsheets in an attempt to make them reporting applications. Essentially, it is a data mart with less-flexible data formats.
SQLRefers to "standard query language," or the computing language needed to get an answer regarding a query from a database.
Staging area Refers to a simplified consolidation and cleansing of operational data coming from multiple sources.
Tabular moduleRefers to a set of metadata-like tables, measures, calculation groups, translations, and other elements that run in-memory or in DirectQuery mode, connecting data from back-end relational data.

We’re Here to Help

For more information about how our ClientInsight platform can help you automate your data quality checks, account for your successes, and gain greater vision and insight into your data, schedule a demo and our experts will reach out to you.

Get Started

Contact us today to learn how Eccovia can help you achieve improved outcomes for the people your organization serves.