Business success is directly related to daily decisions. At all levels and in all functional blocks – from material resources management and production planning to sales and cash flow control. Reliable data are needed to make optimal decisions based on the current state of the company and the external environment.
The explosive increase in the availability of information from various sources has created a situation characterized by two points for business: an excess of data and a lack of relevant information. The corporate data warehouse is one of the critical solutions to solve this problem. So, what is the enterprise data warehouse, and where do we use this.
Introduction to Enterprise Data Warehouse
An enterprise data warehouse – EDW is a specially organized array of enterprise or organization data processed and stored in a single hardware-software complex that provides:
Thus, the EDW’s meaning is the data warehouse is an extensive subject-oriented corporate information EDW database, specially designed and designed to prepare reports, analyze business processes to support decision-making in the organization.
The specificity of the corporate data warehouse is that it converts data, metadata from heterogeneous sources and provides it to users of analytical systems as a single version of the truth. EDW is built based on client-server architecture, relational DBMS, and decision support utilities.
Data Warehouse Models
All data warehouses can be divided into two broad categories:
It should be noted that in practice, no data warehouse exactly matches one or another ideal model.
In normalized storages, data is in subject-oriented tables of the third or more standard forms. In practice, there are six standard forms in which tuples of the database relation can be found: when moving to the next standard form, the properties of the previous typical structures are preserved.
Key features and enterprise data warehouse architecture
Enterprise Data Warehouse can be used for different goals. The main functions are:
Depending on the priorities in the execution of the above essential functions, it is necessary to choose the architecture of EDW.
William Inmon, the founder of a new direction in the development of database/SD technology, in 1990, defined the information storage as a specially administered database, the contents of which have the following properties:
In terms of subject orientation, unlike databases in traditional OLTP systems, where the data is selected following specific applications, the information in the database is focused on decision support tasks. A decision support system requires “historical” data that reflects the development of all areas of the company’s activities over time.
Integration of data consists in the fact that the information storage data coming from various sources, where they can have different names, attributes, measurement units, and encoding methods, are cleared and combined after loading into the SD, from that moment they are presented to the user as a single information space not only in terms of names and formats but also in terms of coding.
The time invariance of data in the data store is achieved by introducing fields with the attribute “time” (day, week, month) in the keys of the tables. As a result, the entries in the SD tables never change, representing snapshots of data taken at specific times. The data store contains “data snapshots.” Each element in its key explicitly or indirectly stores a time parameter, for example, day, month, or year.
Enterprise Data Warehouse Elements
To understand what EDW is more clearly, we offer to look at its components. In general, it consists of the following elements:
WCD includes primarily three repositories:
The need for a data repository became undeniable after a series of unsuccessful attempts to create virtual hard drives. In this architecture, the client program directly received data from sources, transforming them immediately. The simplicity of the architecture compensated for the latency of query execution and data conversion.
Since the result of the query was not saved, a repeated question with the same or similar parameters required a second data conversion, which led to the abandonment of virtual storage and the creation of data repositories.
Dictionaries of data sources are automatically included in the metadata repository. It also stores data formats for their subsequent coordination, the frequency of replenishment of data, consistency in time.
About the principle of ETL
In practice, ETL acts as an intermediate layer between OLTP and OLAP systems. OLTP (Online Transaction Processing) is a transactional system for processing a continuous stream of small-sized transactions in real-time: ERP-, MES-, banking, and exchange applications. They automate large-scale, structured, repetitive data processing tasks, such as order entry and bank transactions, for short periods. However, complex analytical queries, for example, how many goods of the category “handicrafts” were bought by young people aged 18-30 in cities with a population over one million over the past three years, have been carried out in such systems for a very long time.
OLAP systems are designed for such requests. OLAP (Online Analytical Processing) is interactive analytical processing, preparation of summary (aggregated) information based on extensive data arrays structured according to the multidimensional principle. At the same time, a complex data structure is constructed – an OLAP cube, which includes a fact table for which key queries and aggregate (measurement) tables are made, showing how aggregated data can be analyzed.
Thus, the main functions of an ETL system can be represented as a sequence of operations for transferring data from OLTP to OLAP:
Benefits from EDW
The introduction of enterprise data warehouses gives businesses many different benefits:
Also, it substantially reduces the time to backup and restores the database after accidents/failures, and total cost of ownership decreases due to:
Thus, business tasks are more effectively solved through the use of the extended EDW functionality (indexed full-text document search, audit, web-interface, etc.).
Why do enterprises need EDW?
The need for enterprise Data Warehouse was formed around the 90s of the last century when various information systems began to be actively used in the enterprise sector to take into account many business indicators. Each such application successfully solved the task of automating the social production process, for example, performing accounting calculations, conducting transactions, HR analytics, etc.
Ok, how beneficial is it for a business to implement an enterprise-wide data warehouse? Practice shows that EDW can optimize data storage and management. What does a company get with the help of EDW :
Besides, in addition to the services for developing a corporate data warehouse, licenses are sold for both server-side software (OS, database, application server, etc.) and client places for anti-virus protection and security tools.
But there’s one caveat. Intending to EDW data warehousing, a potential client does not always understand how it will provide redundancy. Often, the client’s backup systems are not able to simultaneously connect data volumes from 20-30 TB to the backup. As a rule, specialists and users of the client require training courses.
Let’s sum up the enterprise data warehouse definition. EDW is a solution for storing, combining and processing large amounts of data from various source systems (accounting systems, transaction systems). EDW data warehouse provides quick access to data through storefronts, ensuring the completeness, reliability, and consistency of data.
EDW is designed to store versatile information about the company’s business activities. It is usually built on the principle of a “single version of the truth” and provides access to a wide range of data for business users of the company. The EDW includes the following features and functionality:
For practical purposes, the enterprise warehouse often lays down the functions of transforming the chart of accounts, for example, for reporting standards.
The benefits of using enterprise data warehousing can be:
Labor costs are reduced and reporting development time is reduced, as All the necessary information is available in the EDW data storefronts.