Flow Helps You Build a Unified Analytics Framework

A Unified Analytics Framework, or UAF, is an architecture designed to centralize and standardize the process of transforming raw data into actionable information distributed across an organization.


What is the Unified Analytics Framework?

Decentralized and fragmented approaches to data and engineering governance often lead to inefficiencies and hinder effective decision-making. The UAF addresses these challenges by providing a centralized system for managing business rules, transforming raw data into actionable information, and ensuring consistent data quality. It standardizes the entire process from raw data to information, incorporating rules around data transformation, distribution, access, and governance. This comprehensive framework eliminates data silos, enhances data quality through robust cleansing and validation processes, and ensures that information is managed consistently across the organization.

The UAF also serves as an auditable single source of truth, logging all changes, versioning calculations, and maintaining a complete dependency map of all model components. This rigorous governance framework empowers citizen developers, particularly within the engineering core, by providing tools to leverage deep process knowledge and transform it into valuable data streams accessible by the entire enterprise. By combining the expertise of many operational subject matter experts, the UAF facilitates the creation of comprehensive and context-rich data models. This enables rapid, informed decision-making and fosters continuous improvement and innovation, ultimately driving operational efficiency and effectiveness.

Necessary components of the UAF

Building a Unified Analytics Framework (UAF) requires uniting three core components to work together in the transformation, management, and distribution of data across the entire organization.
  • Information Model - Defines and governs how the key data for the organization is created, managed, and distributed, ensuring consistency and context.
  • Execution Engines - A method for executing and storing the calculations, KPIs, and events defined within the information model, in addition to centralizing data processing, real-time notifications, and the automation of data flows.
  • Information Gateway - Unifies the access to both raw and transformed data as well as all definitions held within the information model via a single, queryable access point.

1. The Information Model

An information model defines the key data an organization focuses on and establishes governance for how that data is created, managed, and used. It includes the rules for transforming data into useful information as well as the rules for distributing it. Utilizing such a model provides many benefits, including:

Decoupled from platforms

The Information Model in the UAF is designed to be decoupled from any specific data source or application, allowing for flexibility and scalability. This ensures that data from multiple sources can be seamlessly integrated, transformed, and governed without dependencies, providing a unified and consistent view across the organization.

Abstracted namespaces

By abstracting data from various sources, the UAF creates a cohesive and unified view. This abstraction layer simplifies data management by concealing the complexities of underlying systems, enabling users to interact with data through a standardized interface, and ensuring consistency and reliability across the entire organization.
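One way to picture this abstraction layer is a mapping from governed, logical model paths to the native addresses of the underlying systems. The paths, source names, and addresses below are invented for illustration; a real UAF would manage this registry through its Information Model rather than a hard-coded dictionary.

```python
# Hypothetical sketch: an abstraction layer that maps logical, governed
# namespace paths to the native addresses of underlying systems, so users
# never need to know which historian or database a tag actually lives in.
NAMESPACE_MAP = {
    "Site1/LineA/Filler/Temperature": ("pi_historian", r"\\PISRV01\FillerTemp01"),
    "Site1/LineA/Filler/BatchID": ("mes_sql", "dbo.Batches.batch_id"),
    "Site1/LineA/Filler/FlowRate": ("influxdb", "line_a/filler/flow_rate"),
}

def resolve(logical_path: str) -> tuple:
    """Translate a logical model path into (source system, native address)."""
    try:
        return NAMESPACE_MAP[logical_path]
    except KeyError:
        raise KeyError(f"'{logical_path}' is not defined in the information model")

source, address = resolve("Site1/LineA/Filler/Temperature")
```

Users query the logical path on the left; only the abstraction layer knows which historian or SQL table answers it, which is what lets sources be swapped without breaking consumers.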

Structured but flexible

The UAF offers a structured framework for data management, giving organizations the freedom to follow existing standards like ISA-95 or build custom solutions that work best for their unique requirements. This balance between structure and flexibility enables efficient data operations and supports continuous improvement and innovation.

Templatized deployments

Leveraging predefined templates, the UAF streamlines data management and governance. This templatized approach allows for quick deployment of consistent data models and standardized rules and transformations. It also enables the extension of previously completed work, creating base templates that can be further divided into various template subsets. This reduces the need for custom coding, accelerates implementation, ensures scalability, and makes it easier to maintain and extend the data infrastructure.
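The base-template-plus-subset idea can be sketched as a simple merge: a base asset template defines standard attributes and calculations once, and subset templates layer changes on top. The pump template, attribute names, and expression strings below are illustrative assumptions, not a real UAF schema.

```python
# Hypothetical sketch of templatized deployment: a base asset template
# defines standard attributes and calculations once; subset templates
# extend it instead of re-coding each deployment from scratch.
BASE_PUMP = {
    "attributes": ["status", "flow_rate", "pressure"],
    "calculations": {"runtime_hours": "integrate(status == 'RUNNING')"},
}

def extend_template(base: dict, overrides: dict) -> dict:
    """Create a subset template by copying the base and layering changes."""
    return {
        "attributes": base["attributes"] + overrides.get("attributes", []),
        "calculations": {**base["calculations"], **overrides.get("calculations", {})},
    }

# A dosing-pump subset adds one attribute and one KPI on top of the base.
DOSING_PUMP = extend_template(
    BASE_PUMP,
    {"attributes": ["dose_volume"],
     "calculations": {"dose_accuracy": "dose_volume / setpoint * 100"}},
)
```

Because the subset only records its deltas, a fix to the base template (say, a corrected `runtime_hours` expression) propagates to every deployment built from it.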

Contextualized information

In manufacturing, context often extends beyond immediate process parameters and machine statuses. The UAF enriches data with comprehensive contextual layers, including temporal data (timestamps and shifts), spatial information (locations and relationships), event logs (alarms and key events), and product details (batch and specifications). This broader scope ensures that data is not only accurate but also actionable, leading to deeper insights and more informed decision-making across the organization.
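The four contextual layers named above can be made concrete with a small enrichment function. The shift boundaries, location format, and field names here are assumptions chosen for illustration.

```python
from datetime import datetime, timezone

# Hypothetical sketch: enriching a raw sensor reading with the four
# contextual layers described above (temporal, spatial, event, product).
# Shift boundaries and field names are illustrative assumptions.
SHIFTS = {range(6, 14): "Day", range(14, 22): "Evening"}

def shift_for(ts: datetime) -> str:
    for hours, name in SHIFTS.items():
        if ts.hour in hours:
            return name
    return "Night"

def contextualize(tag, value, ts, location, batch, active_alarms):
    return {
        "tag": tag, "value": value,
        "temporal": {"timestamp": ts.isoformat(), "shift": shift_for(ts)},
        "spatial": {"location": location},
        "events": {"active_alarms": list(active_alarms)},
        "product": {"batch": batch},
    }

reading = contextualize(
    "Filler/Temperature", 72.4,
    datetime(2024, 5, 1, 9, 30, tzinfo=timezone.utc),
    "Site1/LineA", "B-1042", ["HIGH_TEMP"],
)
```

A consumer receiving `reading` can slice by shift, line, or batch without joining back to the source systems, which is the practical payoff of contextualization.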

2. Execution Engines

The Execution Engines in the UAF are specialized components designed to centralize data processing, provide real-time notifications, and automate data flows within manufacturing environments. These engines ensure that data is handled efficiently and accurately, enabling seamless integration and distribution of information across the organization.

Backfilling

The ability to leverage years of data stored in time series historians as well as transactional records in SQL databases is a key consideration when selecting a data engine. Deploying an information model with backfilling capabilities means you can immediately validate your expressions and ensure that your calculations are providing the results you expected. This becomes crucial as you train models and look towards advanced analytics.
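Backfilling amounts to replaying historical rows through a newly defined expression so its results can be validated immediately. The historian rows and the volume expression below are invented for illustration.

```python
# Hypothetical sketch of backfilling: replay historical historian rows
# through a newly defined expression so its output can be validated now,
# rather than waiting for new data to accumulate.
history = [  # (timestamp, flow in L/min) rows as a historian might return them
    ("2024-05-01T00:00", 120.0),
    ("2024-05-01T01:00", 118.5),
    ("2024-05-01T02:00", 0.0),     # line stopped
    ("2024-05-01T03:00", 121.2),
]

def hourly_volume(flow_lpm: float) -> float:
    """Expression under test: litres produced in one hour at a given flow."""
    return flow_lpm * 60

backfilled = [(ts, hourly_volume(flow)) for ts, flow in history]
total_litres = sum(vol for _, vol in backfilled)
```

Comparing `total_litres` against a known production report for the same period is exactly the kind of sanity check backfilling enables before an expression goes live.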

Data Processing

Manufacturing data often changes, arrives late, and needs to be versioned. If engineers and the enterprise can't trust that the data is accurate or up to date, they won't use it. For that reason, the Data Engine must be capable of automatically handling late or modified data values, version changes, and dependencies. It ensures data integrity by backfilling and rerunning calculations, creating a reliable and consistent data foundation crucial for confident decision-making.
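The late-data behavior described above can be sketched as a version bump plus a rerun of dependent calculations. The storage structures and the dependent calculation here are deliberately minimal assumptions.

```python
# Hypothetical sketch: when a late or corrected value arrives, the engine
# bumps a version, rewrites the affected period, and reruns dependent
# calculations so downstream consumers always see consistent results.
values = {"2024-05-01T02:00": 0.0}      # original (bad) reading
versions = {"2024-05-01T02:00": 1}      # auditable version counter
dependents = {}                          # dependent calc results by timestamp

def recompute(ts: str) -> None:
    dependents[ts] = values[ts] * 60     # dependent calc: hourly volume

def apply_correction(ts: str, corrected: float) -> None:
    values[ts] = corrected
    versions[ts] += 1                    # keep an auditable version history
    recompute(ts)                        # rerun everything that depends on it

recompute("2024-05-01T02:00")
apply_correction("2024-05-01T02:00", 119.8)  # late correction arrives
```

In a real engine the `recompute` step would walk the full dependency map, so a corrected raw value cascades through every KPI derived from it.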

Resist Replicating Data

Storing only new insights or results is paramount to a successful UAF strategy. Data that has already been validated and is stored in other databases should be left there. Instead, only the new insights should be stored: events, KPIs, and the contextualized metadata around them. Raw data should only be called from the original sources as the engines require it for processing.

Messaging

The Messaging Engine should be used to integrate data into the organization's existing notification and communication tools, such as email, SMS, Microsoft Teams, and Slack. By seamlessly delivering real-time notifications and updates through these familiar platforms, it ensures that stakeholders receive critical information in a timely and efficient manner, enhancing communication and responsiveness across the organization.
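A common delivery mechanism for this kind of integration is an incoming webhook. The sketch below builds an alert payload and posts it with the standard library; the webhook URL and payload shape are placeholders, since real Teams and Slack webhooks each expect their own schema.

```python
import json
import urllib.request

# Hypothetical sketch: routing a Messaging Engine alert to a chat tool
# through an incoming-webhook URL. The URL and payload shape are
# placeholders; Teams/Slack each define their own payload schema.
def build_alert(asset: str, message: str, severity: str = "warning") -> dict:
    return {"text": f"[{severity.upper()}] {asset}: {message}"}

def send_alert(webhook_url: str, payload: dict) -> None:
    req = urllib.request.Request(
        webhook_url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    urllib.request.urlopen(req)  # fire-and-forget; add retries in practice

alert = build_alert("Site1/LineA/Filler", "Temperature above 75 C")
# send_alert("https://example.com/webhook/placeholder", alert)
```

Keeping the payload builder separate from the transport makes it easy to fan one alert out to several channels.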

Data Streaming

This engine automates the streaming of data to various databases, data lakes, and BI tools on either a triggered or scheduled basis. By matching the schema of target systems, it facilitates seamless data integration, ensuring that all enterprise systems are synchronized with the latest information.
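"Matching the schema of target systems" can be as simple as a declarative field-to-column map applied to each record before it is streamed. The field and column names below are illustrative assumptions.

```python
# Hypothetical sketch: before streaming, records are reshaped to match
# the column names of the target warehouse table, so loads need no
# transformation on the receiving side. Field names are illustrative.
TARGET_SCHEMA = {          # UAF field -> warehouse column
    "tag": "sensor_id",
    "value": "reading_value",
    "timestamp": "event_ts",
}

def to_target_row(record: dict) -> dict:
    """Rename a UAF record's fields to the target system's columns."""
    return {col: record[field] for field, col in TARGET_SCHEMA.items()}

batch = [{"tag": "FlowRate", "value": 118.5, "timestamp": "2024-05-01T01:00"}]
rows = [to_target_row(r) for r in batch]
```

Because the mapping is data rather than code, adding a new target system means adding a new schema map, not a new pipeline.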

3. Information Gateway

The Information Gateway unifies access to both raw and transformed data as well as all definitions held within the Information Model via a single, queryable access point. Ideally this component supports multiple API technologies, such as GraphQL and REST, and may integrate SQL support to ensure flexibility and compatibility with various applications and systems.
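To make the single access point concrete, the sketch below shows what a GraphQL-style query against such a gateway might look like, returning a calculation's definition alongside its latest value. The query shape and response structure are invented for illustration; in practice the JSON would come back from an HTTP POST to the gateway endpoint.

```python
import json

# Hypothetical sketch: a GraphQL-style query against the gateway that
# returns both a model definition and a computed value in one round trip.
# The schema (asset, calculation, fields) is an invented example.
QUERY = """
{
  asset(path: "Site1/LineA/Filler") {
    calculation(name: "oee") { definition latestValue }
  }
}
"""

# Stand-in for the HTTP response body a real gateway would return.
sample_response = json.loads("""
{"data": {"asset": {"calculation":
  {"definition": "availability * performance * quality",
   "latestValue": 0.87}}}}
""")

calc = sample_response["data"]["asset"]["calculation"]
```

Getting the definition and the value together is the point: a consumer can audit how a number was produced without a second query to a different system.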

Centralized data access

The Information Gateway offers a single point of entry to all connected data sources, simplifying the way users interact with the Information Model. This centralized access streamlines data retrieval and integration processes, making it easier to work with complex data sets.

Registry of underlying databases

The Information Gateway includes a registry for data silos, allowing users to query any underlying databases attached to the Information Model via the UAF without needing to understand the structure or nature of those databases. This simplifies access to disparate data sources and enhances data integration capabilities.

Empowers citizen developers

The Information Gateway helps users of all abilities to create data-driven applications, perform analyses, and generate insights without needing deep technical expertise or knowledge of multiple systems and databases. By lowering barriers to data access, it fosters innovation and agility, enabling more team members to contribute.

HOW FLOW WORKS

Data Sources and Data Consumers

Building a Unified Analytics Framework (UAF) requires a comprehensive approach to ensure cross-platform compatibility and freedom from vendor lock-in. This involves integrating and abstracting a wide variety of manufacturing systems, databases, and modern solutions.

A robust UAF must be designed to seamlessly integrate with a multitude of data sources and consumers, ensuring that the entire manufacturing environment can operate efficiently and cohesively. This involves connecting to various types of manufacturing systems and databases, enabling the organization to unify data from disparate sources, and delivering actionable insights to a wide range of systems.

Key Data Sources

Time Series Historians

The UAF must account for a variety of historian vendors and technologies to ensure cross-platform compatibility and free the enterprise from historian vendor lock-in. This includes supporting:

  • Enterprise Class Historians - AVEVA PI, Canary Labs, etc.
  • Site Historians - Wonderware, Proficy, Citect, FactoryTalk, DeltaV, etc.
  • Open Source Historians - InfluxDB, Timescale, QuestDB, etc.
  • SQL-based Historians - Ignition, VTScada, etc.

Manufacturing Systems and Solutions

Unifying data from these diverse systems necessitates the ability to connect to and pull data from each of them. This integration must encompass both established and legacy providers, as well as modern solutions, to offer comprehensive and flexible data management and operational capabilities. By doing so, organizations can achieve seamless data integration and efficient operations across their entire manufacturing environment.

  • Laboratory Information Management Systems (LIMS) - STARLIMS, LabWare, Thermo Fisher Scientific SampleManager, etc.
  • Manufacturing Execution Systems (MES) - Siemens Opcenter, Rockwell Automation FactoryTalk, AVEVA MES, Sepasoft, etc.
  • Enterprise Resource Planning (ERP) - SAP, Oracle, etc.
  • Computerized Maintenance Management Systems (CMMS) - Fiix CMMS, UpKeep, Maintenance Connection, etc.
  • Enterprise Asset Management (EAM) - IBM Maximo, Infor, SAP, etc.

Real Time Data Capture

In many manufacturing environments, crucial real-time data is not currently being archived, which limits the ability to analyze and transform this information. To address this gap, the UAF should include the capability to collect and store real-time data, playing a role similar to a traditional data historian but specifically for data that is not already being stored.

When connecting to a Unified Namespace (UNS) as a real-time source, it is essential to understand what data is already being historized elsewhere. By identifying and capturing only the data that is not currently stored, the UAF ensures that all relevant information is available for comprehensive analysis and transformation, thereby enhancing decision-making and operational efficiency. This approach prevents redundancy, optimizes storage, and ensures a complete and accurate data set for actionable insights.

Common technologies and protocols that should be supported include OPC servers, MQTT brokers (vanilla and Sparkplug), web APIs, Kafka streams, and other real-time data sources.

SQL Databases

Transaction-based data is crucial in providing the necessary context to slice and interpret time series data effectively. This type of data helps in understanding the events and transactions that occur within the manufacturing process, offering a detailed view of operational activities and their impact. Incorporating transaction-based data into the UAF allows for a more comprehensive analysis, enabling better decision-making and insights. To support this integration, the UAF must be capable of connecting to and pulling data from various SQL databases. This includes widely-used technologies such as:

  • Microsoft SQL Server - A widely used enterprise database known for its tooling and deep integration with the Microsoft ecosystem.
  • MySQL - An open-source database that is popular for its performance, reliability, and ease of use.
  • PostgreSQL - An advanced open-source database that supports complex queries and a wide range of data types.
  • Oracle DB - Renowned for its advanced features, scalability, and strong security measures.

Manual Data Capture

Entering data, categorizing events, and capturing comments and context is vital in ensuring that all relevant information is included in the UAF, especially when certain data points are not automatically collected by sensors or systems. This type of data often includes critical insights from human observations, quality checks, or maintenance activities that provide additional context and depth to the automated data streams.

To effectively integrate manual data, the UAF should support various methods for capturing and categorizing this information. This can be achieved through web forms, mobile applications, and by importing CSV files.
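Of the capture methods listed, CSV import is the easiest to sketch. The column names and category list below are assumptions; the point is that manually entered categories get normalized and flagged rather than silently dropped.

```python
import csv
import io

# Hypothetical sketch: importing manually captured observations from a
# CSV export (column names are assumptions) and normalizing categories
# so free-text entries line up with the information model's event types.
raw = io.StringIO(
    "timestamp,category,comment\n"
    "2024-05-01T09:30,quality check,Seal width within spec\n"
    "2024-05-01T10:05,Maintenance,Replaced filler nozzle 3\n"
)

VALID = {"quality check", "maintenance", "downtime"}

def import_manual_entries(fileobj):
    entries = []
    for row in csv.DictReader(fileobj):
        category = row["category"].strip().lower()
        if category not in VALID:
            category = "uncategorized"   # keep the entry, flag for review
        entries.append({**row, "category": category})
    return entries

entries = import_manual_entries(raw)
```

Keeping unrecognized categories as "uncategorized" preserves the human observation while still routing it for cleanup, which matters when the comment itself is the valuable context.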

Major Data Consumers

The value of data already transformed and heavily encoded with operational process knowledge and context cannot be overstated. For enterprises, such data is a gold mine, offering rich insights and actionable intelligence. The UAF ensures that this data is not only accessible but also integrated seamlessly into various systems, saving data teams significant time and effort. This feature is especially crucial for data teams overseeing multiple sites, as it enables them to make informed decisions quickly and efficiently, leveraging a unified, context-rich dataset. By structuring the data to match the existing schemas of these systems, the UAF facilitates smooth integration and usability.

Data lakes and warehouses

Standard connectivity methods ensure that existing enterprise architectures and data strategies are supported. This means ensuring solutions like AWS, Azure, Snowflake, Databricks, Google BigQuery, and Oracle can be fed data streams from the UAF.

Business Intelligence tools

The UAF supports integration with standard BI tools such as Power BI and Tableau to facilitate data visualization and business analytics.

Advanced analytics and ML/AI

The UAF connects to advanced analytics platforms and machine learning and AI tools through standard methods, creating a plug-and-play environment with structured, contextualized, and even preprocessed data in wide table formats with normalized timestamps, greatly reducing the time to value for these projects.
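The "wide table with normalized timestamps" preparation can be sketched as a pivot: readings from several tags are snapped to a shared time boundary and arranged so each row holds one timestamp and one column per tag. The tag names and minute-level resolution are illustrative assumptions.

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical sketch of the "wide table" preparation mentioned above:
# readings from several tags are snapped to a shared minute boundary and
# pivoted so each row holds one timestamp and one column per tag.
readings = [
    ("Temperature", "2024-05-01T09:30:12", 72.4),
    ("FlowRate",    "2024-05-01T09:30:47", 118.5),
    ("Temperature", "2024-05-01T09:31:05", 72.9),
]

def to_wide(rows):
    wide = defaultdict(dict)
    for tag, ts, value in rows:
        # Normalize each reading to the start of its minute.
        minute = datetime.fromisoformat(ts).replace(second=0).isoformat()
        wide[minute][tag] = value
    return dict(wide)

table = to_wide(readings)
```

ML tooling generally expects exactly this shape, one aligned feature vector per timestamp, which is why delivering it pre-built saves data teams the bulk of their feature-engineering time.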

Back to the UNS

The UAF publishes processed and contextualized data back to the Unified Namespace (UNS) to maintain a continuous and updated data flow, ensuring the integrity and accuracy of the entire data ecosystem.
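Publishing back to the UNS typically means constructing a topic path and a JSON payload for an MQTT-style broker. The ISA-95-style topic convention and payload fields below are assumptions; the actual publish would go through an MQTT client, e.g. a Sparkplug-aware library.

```python
import json

# Hypothetical sketch: processed results are published back to the UNS
# under an ISA-95-style topic path. The topic convention and payload
# fields are assumptions, not a fixed standard.
def uns_message(enterprise, site, area, line, metric, value, ts):
    topic = f"{enterprise}/{site}/{area}/{line}/Calculated/{metric}"
    payload = json.dumps({"value": value, "timestamp": ts, "quality": "GOOD"})
    return topic, payload

topic, payload = uns_message(
    "Acme", "Site1", "Packaging", "LineA", "oee", 0.87, "2024-05-01T10:00:00Z"
)
```

Segregating calculated values under their own branch (here `Calculated/`) keeps derived data distinguishable from raw sensor topics in the namespace.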

HOW FLOW WORKS

Integrating a Unified Namespace

The UAF builds on the foundation provided by the UNS by governing the transformation and additional contextualization of the data collected. The UNS acts as a real-time data bus that aggregates data from various sources, offering a cohesive view of the entire operation. It ensures seamless data flow and interoperability between disparate systems by continuously updating with the latest data from sensors, PLCs, SCADA systems, and historians. However, the UNS provides raw data without context.

Through its Information Model, the UAF defines key data points, relationships, and rules for transforming raw data into actionable insights. The Execution Engines of the UAF process this data in real time, handle notifications, and automate data flows, ensuring that data is correctly contextualized and enriched with operational insights. The Information Gateway provides a single endpoint for accessing both raw and processed data, facilitating easy integration and data sharing across the enterprise. Together, the UNS and UAF enable manufacturers to achieve enhanced visibility, efficiency, and decision-making capabilities, driving continuous improvement and operational excellence.

Aligning IT and OT

Deploying and expanding a Unified Analytics Framework (UAF) begins with empowering the Operations team, as their involvement is crucial for success. The primary goal is to simplify their tasks. Start by identifying an existing data and integration problem that Operations needs to solve. Use the UAF to address this problem, showcasing the framework's benefits and ease of use. This initial success will serve as a proof of concept and help secure buy-in from the Operations team.

This "land and expand" strategy ensures that, once the UAF has added value to Operations, they will quickly identify other use cases for it. As Operations expands the information model, more detailed and comprehensive data becomes available to the entire enterprise. This iterative approach not only solves immediate operational challenges but also builds a solid foundation for broader data integration and governance. By continuously addressing real-world problems and incrementally expanding the UAF, the organization can achieve a comprehensive view of its operations, driving continuous improvement and innovation.

However, many IT departments argue that their existing data lakes or BI tools already handle data integration and analysis. These architectures often overlook the critical component of Operations Technology (OT). Without direct OT involvement, several issues arise, including incomplete, inaccurate, or delayed data. OT professionals possess intimate knowledge of the data and its context, yet they are typically excluded from data integration and analysis processes. This exclusion leads to significant gaps in the data's relevance and utility for operational needs.

When IT attempts to contextualize data from operations independently, the lack of OT's detailed insights often results in suboptimal outcomes. OT personnel, unfamiliar with IT's tools and methodologies, find it impractical to rely on IT for timely and relevant insights, leading them to frequently resort to workarounds like using Excel for data analysis. These workarounds undermine the overall data strategy and highlight the need for a solution that prioritizes OT involvement from the start.

Involving OT from the beginning and using a UAF approach to feed enriched, context-rich data into data lakes or BI tools enhances their effectiveness. This strategy ensures these tools receive high-quality, contextualized data, making them more useful and relevant. The UAF approach not only supports IT's data strategies but also empowers OT by integrating their insights and needs into the data governance process. This comprehensive involvement of OT and IT leads to more accurate, timely, and relevant data insights, fostering a collaborative environment that drives continuous improvement and operational efficiency across the enterprise. Additionally, the UAF serves as a conduit for bringing enterprise-level insights back to the site level, ensuring that valuable information flows both ways and benefits the entire organization.

Origins of the Unified Analytics Framework

The Unified Analytics Framework (UAF) was conceived by industry leaders Graeme Welton, Leonard Smit, Walker Reynolds, Allen Ray, and Jeff Knepper, with contributions from manufacturing user groups like IntegrateLive! and the Industry 4.0 community. They recognized a common issue: fragmented data systems and inefficiencies due to disconnected data silos, particularly the spread of data transformation processes across the application layer, making them impossible to govern. Determined to centralize integration and transformation work, they set out to create a framework that would unify data sources while prioritizing OT's requirements.

Through collaborative efforts, discussions, and workshops, the concept of the UAF took shape. The aim was to develop a framework that could integrate, contextualize, and govern data, providing a single, auditable source of truth. The UAF was designed to make OT's job easier and ensure reliable data governance through its ability to log changes, version calculations, and maintain a full dependency map.

The leadership at CESMII, including John Dyck, Jonathan Wise, and John Louka, played a key role in highlighting the problems facing manufacturers and the need for centralized governance, helping to shape the understanding and necessity of the UAF. Today, the UAF continues to evolve with ongoing contributions from the manufacturing community, delivering improved decision-making and operational efficiency across the industry.