Is your data warehouse a mud pit of technical debt that your organisation is crawling through to catch up, or are you running in the wide open spaces of the new agile world of analytics? Information is the lifeblood of many organisations, yet most fail to derive full value from the ever-increasing information available to them. Slowly but surely, the way organisations leverage their information is changing, by evolution or revolution. Instead of using information to provide a rear-window view of behaviour and performance, new architectures and approaches give organisations a significant opportunity to drive the business through real-time information and analytics, and so achieve better business outcomes.
Traditional big-bang data warehousing was slow and expensive to deliver, leaving behind a significant amount of technical debt that needed to be fed and watered. The data warehouse often missed the mark with the business because of rigid, waterfall-based requirements approaches, and it did not provide the capability to drive the business through analytics, machine learning, and other advanced techniques.
The other significant challenge for organisations is an application landscape that continues to evolve at pace, with SaaS and cloud-based solutions becoming ever more common. Traditional data warehouses have failed to keep up with these dynamic business and technical environments.
Business intelligence projects have often failed to deliver what the business needs because they focused on delivering what the project team thought the business asked for. There is an incorrect assumption that business users can correctly articulate their information and capability requirements up front. Agile methodologies address this by putting the business in the driving seat and using an iterative development process that moves towards the solution in a series of rapid steps. Supporting this rapid development process, however, requires a different architecture: one that allows fast turnaround in ingesting information and rapid delivery to the business through analytics and visualisations as appropriate. Organisations also need to achieve this agility with appropriate governance and controls. This is an area of massive evolution and revolution, driven mostly by open source initiatives developed by tech companies that drive business value through analytics.
Technologies that support rapid delivery
Big Data Solutions: Hadoop, along with a large number of complementary open source projects, provides a broad technology stack with a huge amount of capability. It is designed for rapid ingestion of all types of information, which is then modelled on use via user-friendly data-wrangling tools. This repository, often described as a "data lake", is great for analytics: it helps data scientists and data specialists gain insight faster, from a richer set of information, providing a broader set of analytical outcomes.
- Pros: Can be deployed in a small footprint, while allowing the platform to scale as demand and usage increase. If organisations are comfortable with open source, there is a wide range of microservice capabilities, such as supporting real-time analytics with event processing.
- Cons: Can require significant resources to support and manage, particularly under heavy usage. Queries against native Hadoop storage can be slow, requiring the use of a faster data store.
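The "modelled on use" idea above is often called schema-on-read: land the raw data as-is, and only apply structure, types, and defaults when someone queries it. A minimal sketch of that pattern, using plain JSON-lines files and the Python standard library to stand in for a real data lake (the field names and records here are hypothetical):

```python
import json
import tempfile
from pathlib import Path

# Land raw events in the "lake" exactly as they arrive -- no upfront modelling.
lake = Path(tempfile.mkdtemp()) / "events.jsonl"
raw_events = [
    {"user": "alice", "amount": "42.50", "ts": "2024-01-05"},
    {"user": "bob", "amount": "19.99"},             # missing field is fine at ingest time
    {"user": "alice", "amount": "7.25", "ts": "2024-01-06"},
]
lake.write_text("\n".join(json.dumps(e) for e in raw_events))

def read_with_schema(path):
    """Apply a schema only when the data is used (schema-on-read)."""
    for line in path.read_text().splitlines():
        e = json.loads(line)
        yield {
            "user": e["user"],
            "amount": float(e["amount"]),   # type coercion happens at read time
            "ts": e.get("ts", "unknown"),   # defaults for fields absent at ingest
        }

# A downstream "use" of the lake: total spend per user.
total_by_user = {}
for row in read_with_schema(lake):
    total_by_user[row["user"]] = total_by_user.get(row["user"], 0.0) + row["amount"]

print(total_by_user)  # alice: 49.75, bob: 19.99
```

Note that the ingest step never rejected the record with the missing field; the decision about how to handle it was deferred to query time, which is what makes adding new sources fast.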
Data Virtualisation: Data virtualisation can provide significant efficiencies in development effort: rather than building ETL code, virtual data structures are developed that can be rolled out to users rapidly.
- Pros: Fast to deliver.
- Cons: Can struggle with query performance for some types of queries, and does not offer the wide range of capability that big data solutions provide.
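To make the "virtual data structures instead of ETL" point concrete, here is a minimal sketch using an in-memory SQLite database: two tables stand in for separate source systems, and a view joins them on demand, so no job ever copies the data into a new store. The table and column names are invented for illustration only:

```python
import sqlite3

# Two "source systems" stay where they are; no ETL job copies the data.
con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE crm_customers (id INTEGER, name TEXT);
    CREATE TABLE billing_invoices (customer_id INTEGER, amount REAL);
    INSERT INTO crm_customers VALUES (1, 'Acme'), (2, 'Globex');
    INSERT INTO billing_invoices VALUES (1, 100.0), (1, 250.0), (2, 75.0);

    -- The "virtual data structure": a view that joins the sources at query time.
    CREATE VIEW customer_spend AS
        SELECT c.name, SUM(i.amount) AS total
        FROM crm_customers c
        JOIN billing_invoices i ON i.customer_id = c.id
        GROUP BY c.name;
""");

rows = con.execute("SELECT name, total FROM customer_spend ORDER BY name").fetchall()
print(rows)  # [('Acme', 350.0), ('Globex', 75.0)]
```

The trade-off in the cons above also shows up here: because the join runs every time the view is queried, performance depends entirely on the underlying sources, which is why virtualisation can struggle for heavy query workloads.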
Both of these technologies can enhance existing data warehouses by allowing new data sources to be added at pace, so the business can start driving value from the information sooner. Initially this might be for analytics, operational reporting, or targeted management reporting. Longer term, these information sources can be modelled into traditional data warehouse structures and made available for organisation-wide reporting as the business requires.