Reading time – 14min

May 25, 2020 | 14:05

Share

Business success is directly related to daily decisions. At all levels and in all functional blocks – from material resources management and production planning to sales and cash flow control. Reliable data are needed to make optimal decisions based on the current state of the company and the external environment.

The explosive increase in the availability of information from various sources has created a situation characterized by two points for business: an excess of data and a lack of relevant information. The corporate data warehouse is one of the critical solutions to solve this problem. So, what is the enterprise data warehouse, and where do we use this.

Introduction to Enterprise Data Warehouse

EDW

An enterprise data warehouse – EDW is a specially organized array of enterprise or organization data processed and stored in a single hardware-software complex that provides:

  • quick access to operational and historical information; multidimensional data analysis (KPI for various measurements)
  • obtaining forecasts and statistics in the context of agreed regulatory reference information (NSI).
  • Thus, the EDW’s meaning is the data warehouse is an extensive subject-oriented corporate information EDW database, specially designed and designed to prepare reports, analyze business processes to support decision-making in the organization.

    The specificity of the corporate data warehouse is that it converts data, metadata from heterogeneous sources and provides it to users of analytical systems as a single version of the truth. EDW is built based on client-server architecture, relational DBMS, and decision support utilities.

    Data Warehouse Models

    Data warehouse models
    warehouse group

    All data warehouses can be divided into two broad categories:

  • Normalized data warehouses (Inman model);
  • Dimensional storage (Kimball model).
  • It should be noted that in practice, no data warehouse exactly matches one or another ideal model.

    In normalized storages, data is in subject-oriented tables of the third or more standard forms. In practice, there are six standard forms in which tuples of the database relation can be found: when moving to the next standard form, the properties of the previous typical structures are preserved.

    Key features and enterprise data warehouse architecture

    Key features and enterprise data

    Enterprise Data Warehouse can be used for different goals. The main functions are:

  • Complete and timely collection and processing of information from data sources.
  • Reliable and secure data storage.
  • We are providing data for analytical work.
  • Depending on the priorities in the execution of the above essential functions, it is necessary to choose the architecture of EDW.

    William Inmon, the founder of a new direction in the development of database/SD technology, in 1990, defined the information storage as a specially administered database, the contents of which have the following properties:

  • Subject orientation.
  • Data integration.
  • Time invariance.
  • Indestructibility – stability of information.
  • Minimization of redundancy of information.
  • In terms of subject orientation, unlike databases in traditional OLTP systems, where the data is selected following specific applications, the information in the database is focused on decision support tasks. A decision support system requires “historical” data that reflects the development of all areas of the company’s activities over time.

    Integration of data consists in the fact that the information storage data coming from various sources, where they can have different names, attributes, measurement units, and encoding methods, are cleared and combined after loading into the SD, from that moment they are presented to the user as a single information space not only in terms of names and formats but also in terms of coding.

    The time invariance of data in the data store is achieved by introducing fields with the attribute “time” (day, week, month) in the keys of the tables. As a result, the entries in the SD tables never change, representing snapshots of data taken at specific times. The data store contains “data snapshots.” Each element in its key explicitly or indirectly stores a time parameter, for example, day, month, or year.

    Enterprise Data Warehouse Elements

    Enterprise Data Warehouse Elements

    To understand what EDW is more clearly, we offer to look at its components. In general, it consists of the following elements:

  • ETL-tools for extracting, converting, and loading data into a central data warehouse (WCD).
  • WCD designed and optimized for reliable and secure data storage.
  • Data marts provide sufficient user access to data that is stored in structures that are optimal for solving specific user problems.
  • WCD includes primarily three repositories:

  • reference information repository;
  • data repository;
  • metadata repository.
  • The need for a data repository became undeniable after a series of unsuccessful attempts to create virtual hard drives. In this architecture, the client program directly received data from sources, transforming them immediately. The simplicity of the architecture compensated for the latency of query execution and data conversion.

    Since the result of the query was not saved, a repeated question with the same or similar parameters required a second data conversion, which led to the abandonment of virtual storage and the creation of data repositories.

    Dictionaries of data sources are automatically included in the metadata repository. It also stores data formats for their subsequent coordination, the frequency of replenishment of data, consistency in time.

    About the principle of ETL

    ETL

    In practice, ETL acts as an intermediate layer between OLTP and OLAP systems. OLTP (Online Transaction Processing) is a transactional system for processing a continuous stream of small-sized transactions in real-time: ERP-, MES-, banking, and exchange applications. They automate large-scale, structured, repetitive data processing tasks, such as order entry and bank transactions, for short periods. However, complex analytical queries, for example, how many goods of the category “handicrafts” were bought by young people aged 18-30 in cities with a population over one million over the past three years, have been carried out in such systems for a very long time.

    OLAP systems are designed for such requests. OLAP (Online Analytical Processing) is interactive analytical processing, preparation of summary (aggregated) information based on extensive data arrays structured according to the multidimensional principle. At the same time, a complex data structure is constructed – an OLAP cube, which includes a fact table for which key queries and aggregate (measurement) tables are made, showing how aggregated data can be analyzed.

    Thus, the main functions of an ETL system can be represented as a sequence of operations for transferring data from OLTP to OLAP:

  • Loading Raw Data into an arbitrary quality ETL for further processing. In this case, the sum of the received rows is verified: if the source system has more rows than the Raw Data, then the download failed.
  • Validation of data, when the data is consistently checked for correctness and completeness, an error report is generated for correction.
  • Correspondence setting (mapping) of data with the target model, when columns are added to the validated table according to the number of directories of the target model, and then in each attached cell of each row, the correspondence of the values ​​of the target directories (1: 1, *: 1, 1: * or *: *).
  • Aggregation of data required due to the difference in data granularity in OLTP and OLAP systems. OLAP is a fully denormalized fact table and the surrounding reference tables in the asterisk or snowflake pattern. Moreover, the maximum granularity of OLAP amounts is equal to the number of permutations (aggregations) of all elements of all directories. An OLTP system may contain several amounts for the same set of directory entries. To trace from which OLTP lines the sum in the cell of the OLAP system was formed, mapping of the OLTP granularity is necessary, and then “gluing” the data in a separate table for loading into OLAP.
  • Upload to the target system using a connector and interface tools.
  • Benefits from EDW

    EDW benefits

    The introduction of enterprise data warehouses gives businesses many different benefits:

  • it allows you to organize structured storage of documents by the classification adopted by the organization.
  • the volume of business systems databases with which the EDW is integrated is significantly reduced by removing copies of electronic documentation from the source transaction systems;
  • Also, it substantially reduces the time to backup and restores the database after accidents/failures, and total cost of ownership decreases due to:

  • simplification of support, administration, and backup of the storage: administration and maintenance of the warehouse will be carried out in a single access point;
  • unification of the role-based data model for accessing the content in the repository will simplify the delimitation and access control;
  • Thus, business tasks are more effectively solved through the use of the extended EDW functionality (indexed full-text document search, audit, web-interface, etc.).

    Why do enterprises need EDW?

    EDW

    The need for enterprise Data Warehouse was formed around the 90s of the last century when various information systems began to be actively used in the enterprise sector to take into account many business indicators. Each such application successfully solved the task of automating the social production process, for example, performing accounting calculations, conducting transactions, HR analytics, etc.

    Ok, how beneficial is it for a business to implement an enterprise-wide data warehouse? Practice shows that EDW can optimize data storage and management. What does a company get with the help of EDW :

  • A unified business data warehouse for storing corporate data appears, which uses single reference information.
  • There is the possibility of a comprehensive business analysis. For example: which clients are the most valuable and profitable, which service, and which clients are the most demanded, what kind of claims are most frequent and in which regions, etc.
  • They are conducting analysis using historical data. Often, operating (automating daily business processes) systems do not allow this; they simply do not have enough space to store history and capacity for analysis.
  • Now it is possible to connect and analyze information previously stored in different information systems. For example, traffic data from various branches is stored in billing systems from different developers. After the introduction of CDs, it becomes possible to analyze them together in a single report.
  • Now it is possible to analyze and cross data of different kinds. For example, money and traffic, the number of personnel and the number of failures or claims, etc.
  • The basis for a better calculation of the cost of services is based on information from the corporate data warehouse, and you can get more adequate data for in-kind distribution databases.
  • Besides, in addition to the services for developing a corporate data warehouse, licenses are sold for both server-side software (OS, database, application server, etc.) and client places for anti-virus protection and security tools.

    But there’s one caveat. Intending to EDW data warehousing, a potential client does not always understand how it will provide redundancy. Often, the client’s backup systems are not able to simultaneously connect data volumes from 20-30 TB to the backup. As a rule, specialists and users of the client require training courses.

    To summarize

    Let’s sum up the enterprise data warehouse definition. EDW is a solution for storing, combining and processing large amounts of data from various source systems (accounting systems, transaction systems). EDW data warehouse provides quick access to data through storefronts, ensuring the completeness, reliability, and consistency of data.

    EDW is designed to store versatile information about the company’s business activities. It is usually built on the principle of a “single version of the truth” and provides access to a wide range of data for business users of the company. The EDW includes the following features and functionality:

  • Extract, convert and load data into EDW.
  • Data storage.
  • Data processing (calculation of indicators, aggregation, and filtering of data).
  • Providing enterprise data warehouse reporting to systems and external systems through data marts.
  • For practical purposes, the enterprise warehouse often lays down the functions of transforming the chart of accounts, for example, for reporting standards.

    The benefits of using enterprise data warehousing can be:

  • The introduction of EDW in the first place allows you to increase the efficiency of managerial decision-making by increasing the reliability of information and reducing the time it is received.
  • It becomes possible to quickly receive reports regularly in automatic mode and the required detail.
  • Labor costs are reduced and reporting development time is reduced, as All the necessary information is available in the EDW data storefronts.

    Tags

    #Custom Software#Logistic#Transportation
    Views: 256
    Interview popup form

    Get the expert's consultation on how to develop Startup.

    Sent!

    Thank you! We’ll be in touch soon.