Data
Warehouse
Glossary
Key Data Warehouse Terms
For a comparison of data warehousing to databases, data lakes, etc., please see Demystify Data Warehouse Basic Concepts.
Term | Definition |
---|---|
Aggregation | Refers to the summing up of facts in selected dimensions from the original fact table (see “fact table” below). |
Artificial intelligence (AI) | Refers to the development of computer systems that can perform tasks normally performed by humans (i.e., visual perception, speech recognition, decision making, language translation). See “deep learning” and “machine learning.” |
Analytics | Refers to the process of finding meaningful patterns in data sets. |
Attribute | Refers to a single field of information in a dimension (see “dimension” below), such as a client ID number. |
Big data | Refers to the size and complexity of data sets that can’t be adequately handled by traditional databases. |
Business intelligence (BI) | Refers to using software and services to turn data into actionable insights to inform an organization’s strategic decisions. |
Common table expressions | Refers to the result set of a query that temporarily exists inside a larger query, for the purpose of deconstructing queries into reusable blocks. |
Data architecture | Refers to the way data gets processed and stored, including the data models (see “data modeling” below). |
Data enrichment | Refers to enhancing, refining, or otherwise improving raw data (i.e., misspellings or typographical errors) with precision algorithms. |
Data ingestion | Refers to the process of collecting and loading data into a database. |
Data migration | Refers to the process of permanently transferring data from one computer storage system to another. |
Data mining | Refers to searching for hidden patterns of data, from different perspectives. |
Data modeling | Refers to the frameworks used within information systems to store and manage data for consistent usage. |
Deep learning | Refers to a subset of machine learning (see “machine learning” below) that mimics the complexity of human neural networks, to mimic skills that come intuitively to humans. |
Deduplication | Refers to the use of machine learning (see “machine learning” below) to eliminate redundant data. |
Data pipeline | Refers to the process of extracting data from various sources in an automated way. |
Data visualization | Refers to graphically presenting information to give readers a deeper, visual understanding. |
Dimension | Refers to a category of information, like personally identifiable information (PII). |
Dimensional model | Refers to the data modeling, or how the data is organized. |
Drill across/down/through/up | Refers to data analysis using metaphorical directions: • Across refers to dimensions, • Down refers to a child attribute, • Through refers to displaying another aspect of possibly relevant data with a pop-up chart, and • Up refers to a parent attribute. |
ETL | Or "Extract, Transform, Load" is the process by which data exits and enters the data warehouse. |
Fact table | Refers to a table type, typically including two types of columns: fact columns and foreign keys to the dimensions. Facts are the performance measurements from business events, such as sales amount, client enrollments, cost of medical procedures, and so on. |
Machine learning | Is a subset of artificial intelligence (AI) that improves performance based on iterations of data processing and building systems to capture performance. It’s widely used by large enterprises today. AI is not always machine learning, but machine learning is always AI. |
OLAP | Stands for “online analytical processing,” and is a data cube, referring to a data architecture that is built from tables in a database that has calculations. |
Operational data store (ODS) | Refers to a source of data that is often used as a temporary staging area (see “staging area” below) to upload into a conventional, modern data warehouse. ODS data is cleaned and validated but often isn’t very historically deep. |
Relational database | Refers to a set of data tables with columns and rows that support full SQL (see “SQL” below.) |
Schema | Refers to a collection of database objects like tables, views, indexes, and synonyms that form the dimensional model (see “dimensional model”) of a data warehouse. |
Spread-mart | Refers to a massive workbook filled with dozens or hundreds of spreadsheets in an attempt to make them reporting applications. Essentially, it is a data mart with less-flexible data formats. |
SQL | Refers to "standard query language," or the computing language needed to get an answer regarding a query from a database. |
Staging area | Refers to a simplified consolidation and cleansing of operational data coming from multiple sources. |
Tabular module | Refers to a set of metadata-like tables, measures, calculation groups, translations, and other elements that run in-memory or in DirectQuery mode, connecting data from back-end relational data. |
We’re Here to Help
For more information about how our ClientInsight platform can help you automate your data quality checks, account for your successes, and gain greater vision and insight into your data, schedule a demo and our experts will reach out to you.
Get Started
By Phone
Contact us today to learn how Eccovia can help you achieve improved outcomes for the people your organization serves.
To call us right now, just click the phone icon below.
By Form
Please fill out the form below if you’d like
to speak
with one of our experts.
with one of our experts.