Introduction to DeepCausality

This post was first published on Sept. 26 on the blog of the Linux Foundation for AI & Data under the title: Introduction to DeepCausality

The DeepCausality project was recently accepted into the Linux Foundation for AI & Data and, as the main author of the project, I want to use the occasion to share a brief introduction.

What is computational causality?

Although deep learning is rooted in statistics, popular deep learning frameworks such as TensorFlow or PyTorch shield developers from the underlying math. Under the hood, however, statistics uses correlation to map an input (say, a question) to an output (an answer). Contemporary deep learning has taken statistics one step further, but a correlation-based foundation still has certain limitations. First, correlation leads to non-determinism because, just by chance, a variable can correlate with otherwise random values. Second, separating the signal from the noise requires very large quantities of data. More fundamentally, deep learning requires that the data used during training follow the same distribution as the data the model encounters in production. If that is not the case, the model requires re-training. When the data distribution is unstable or continuously shifting, deep learning falls short.

Computational causality, on the other hand, utilizes cause-and-effect relationships to go beyond correlation-based predictive models and toward AI systems that can prescribe actions more effectively and autonomously. At its core, causality-based reasoning is deterministic, meaning that the same input data fed into the same model always yields the same result. This is very different from correlation-based deep learning, which may or may not give you a similar answer.

While computational causality has been researched since the 1980s, it was not until early 2020 that the industry started exploring its application. Netflix, for example, reported in 2022 that it uses causality in its recommendation engine. Unfortunately, Netflix has kept its core computational causality technology behind closed doors. Other companies, like Microsoft and Uber, contributed meaningful work to the field through their open source commitment. Currently, all publicly available work in computational causality relies on libraries written in Python. DeepCausality changes that and brings computational causality for the first time to Rust, the most loved programming language.

What is DeepCausality?

DeepCausality is a hyper-geometric computational causality library for Rust that contributes:

  1. Contextual causal reasoning
  2. End-to-end explainability
  3. Causal State Machines

Contextual causal reasoning

Contemporary computational causality is context-free, and active research focuses on causal discovery, that is, learning causal relations from data. Only very recently, leading researchers at Cambridge University started conceptualizing the addition of a temporal context to causality-based deep learning.

DeepCausality already has complete support for contextual causal models. Specifically, a context may be built from multiple data sources or live data streams. It can then be accessed from within the causal model, thus allowing efficient reasoning over contextualized data. Moreover, DeepCausality already fully supports up to four-dimensional contexts, so that data scientists and ML engineers can freely model data, space, time, and space-time contexts, allowing for rich contextualization of complex data analytics systems.
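To make the idea of a context more tangible, here is a minimal sketch in plain Rust. It does not use the DeepCausality API: the names `Contextoid`, `Context`, `upsert`, and `price_above_last_close` are hypothetical and chosen only for illustration. The sketch assumes a context is a keyed collection of data, space, time, and space-time nodes that a causal function can read from.

```rust
use std::collections::HashMap;

/// One node in a context. The four variants mirror the data, space, time,
/// and space-time dimensions described above (hypothetical types).
#[derive(Debug, Clone, PartialEq)]
pub enum Contextoid {
    Data(f64),
    Space { x: f64, y: f64 },
    Time { unix_secs: u64 },
    SpaceTime { x: f64, y: f64, unix_secs: u64 },
}

/// A context is a keyed collection of nodes that any number of
/// data sources or live streams can update.
#[derive(Debug, Default)]
pub struct Context {
    nodes: HashMap<String, Contextoid>,
}

impl Context {
    pub fn new() -> Self {
        Self::default()
    }

    /// Insert or overwrite a node, e.g. when a live feed delivers a new value.
    pub fn upsert(&mut self, key: &str, node: Contextoid) {
        self.nodes.insert(key.to_string(), node);
    }

    pub fn get(&self, key: &str) -> Option<&Contextoid> {
        self.nodes.get(key)
    }
}

/// A causal function that looks up part of its input in the context
/// instead of receiving everything as an argument.
pub fn price_above_last_close(price: f64, ctx: &Context) -> bool {
    match ctx.get("last_close") {
        Some(Contextoid::Data(close)) => price > *close,
        // Without the required context node, the cause cannot hold.
        _ => false,
    }
}
```

The point of the design is that the causal function stays small while the context carries the surrounding state; updating the context changes the outcome without touching the model.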

Contemporary computational causality relies on arithmetic and algebra as its foundation, which works well for formalization but has its limits when complexity grows. When a use case requires hundreds if not thousands of causal relations, the arithmetic becomes increasingly complex. While arithmetic complexity is not an obstacle to academic research, it is an obstacle to scalability. Geometry, on the other hand, comes with much simpler arithmetic but usually reaches its limits when structures become increasingly complex; therefore, the actual challenge centers around simplifying structural complexity.

DeepCausality solves structural complexity with recursive isomorphic causal data structures that enable concise expression of arbitrarily complex causal structures. That means, instead of relying on algebraic structures, causal models are expressed as graph networks over which DeepCausality reasons.

A causal hypergraph may contain any number of nodes with any number of relations to other nodes, with each node representing a cause. Furthermore, a node may contain a collection of causes or even another causal graph. That means a causal model represented as a graph may store other causal sub-models in each node of the causal graph, hence efficiently representing otherwise complex causal structures. DeepCausality can reason over a single cause in the graph, a selected sub-graph, or the graph itself.
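The recursive structure described above can be sketched with a Rust enum. This is a deliberately simplified illustration and not the DeepCausality API: all names (`Cause`, `Node`, `CausalGraph`, `reason_all`, `reason_from`) are hypothetical, ordinary directed edges stand in for hyperedges, and "reasoning" is assumed to mean that every visited node must evaluate to true.

```rust
/// A cause evaluates one observation to true or false.
pub struct Cause {
    pub name: &'static str,
    pub func: fn(f64) -> bool,
}

/// A node holds a single cause, a collection of causes, or a whole sub-graph.
pub enum Node {
    Single(Cause),
    Collection(Vec<Cause>),
    Graph(CausalGraph),
}

/// Simplified causal graph: nodes plus directed edges between node indices.
#[derive(Default)]
pub struct CausalGraph {
    nodes: Vec<Node>,
    edges: Vec<(usize, usize)>,
}

impl Node {
    /// Recursive evaluation: a nested graph is evaluated like any other node.
    pub fn eval(&self, obs: f64) -> bool {
        match self {
            Node::Single(c) => (c.func)(obs),
            Node::Collection(cs) => cs.iter().all(|c| (c.func)(obs)),
            Node::Graph(g) => g.reason_all(obs),
        }
    }
}

impl CausalGraph {
    pub fn new() -> Self {
        Self::default()
    }

    /// Adds a node and returns its index.
    pub fn add_node(&mut self, node: Node) -> usize {
        self.nodes.push(node);
        self.nodes.len() - 1
    }

    pub fn add_edge(&mut self, from: usize, to: usize) {
        self.edges.push((from, to));
    }

    /// Reason over the graph itself: every node must hold.
    pub fn reason_all(&self, obs: f64) -> bool {
        self.nodes.iter().all(|n| n.eval(obs))
    }

    /// Reason over the sub-graph reachable from `start` (depth-first).
    pub fn reason_from(&self, start: usize, obs: f64) -> bool {
        if start >= self.nodes.len() {
            return false;
        }
        let mut visited = vec![false; self.nodes.len()];
        let mut stack = vec![start];
        while let Some(i) = stack.pop() {
            if i >= self.nodes.len() || visited[i] {
                continue;
            }
            visited[i] = true;
            if !self.nodes[i].eval(obs) {
                return false;
            }
            for &(from, to) in &self.edges {
                if from == i {
                    stack.push(to);
                }
            }
        }
        true
    }
}

/// Builds a small example: node c wraps a nested graph, with edges c -> a -> b.
pub fn demo_graph() -> CausalGraph {
    let mut inner = CausalGraph::new();
    inner.add_node(Node::Single(Cause { name: "above 10", func: |x: f64| x > 10.0 }));

    let mut g = CausalGraph::new();
    let a = g.add_node(Node::Single(Cause { name: "positive", func: |x: f64| x > 0.0 }));
    let b = g.add_node(Node::Collection(vec![
        Cause { name: "below 100", func: |x: f64| x < 100.0 },
        Cause { name: "finite", func: |x: f64| x.is_finite() },
    ]));
    let c = g.add_node(Node::Graph(inner));
    g.add_edge(c, a);
    g.add_edge(a, b);
    g
}
```

Because `Node::Graph` nests a full `CausalGraph`, a sub-model can be stored inside a single node, and the same `eval` call works for a single cause, a collection, or a nested graph.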

Combined, a causal hypergraph and a context form the backbone of contextual causal reasoning in DeepCausality. Taken even further, multiple causal models in DeepCausality may share the same context but evaluate different aspects of it, hence allowing memory-efficient system designs. Contextual causal reasoning allows the exploration of new approaches to existing challenges. For example, transferable context structures become relevant for transferring entire model groups when a context shared across all models can be transferred into a new area. Consequently, encoding contextual assumptions enables the automatic search for novel applications into which a context and all dependent models can be transferred.
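Sharing one context between several models can be illustrated with standard-library reference counting. This is only a sketch under stated assumptions: `SharedContext`, `Model`, `price_is_high`, and `volume_is_high` are hypothetical names, and `Arc<RwLock<...>>` is one common Rust way to share mutable state, not necessarily how DeepCausality does it.

```rust
use std::sync::{Arc, RwLock};

/// Hypothetical context holding values that several models read.
#[derive(Debug, Default)]
pub struct SharedContext {
    pub last_price: f64,
    pub avg_volume: f64,
}

/// A model holds a handle to the shared context, not a copy of it.
pub struct Model {
    ctx: Arc<RwLock<SharedContext>>,
    rule: fn(&SharedContext) -> bool,
}

impl Model {
    pub fn new(ctx: Arc<RwLock<SharedContext>>, rule: fn(&SharedContext) -> bool) -> Self {
        Self { ctx, rule }
    }

    /// Evaluates the model's rule against the current state of the context.
    pub fn evaluate(&self) -> bool {
        let guard = self.ctx.read().expect("context lock poisoned");
        (self.rule)(&guard)
    }
}

/// Two rules that look at different aspects of the same context.
pub fn price_is_high(ctx: &SharedContext) -> bool {
    ctx.last_price > 100.0
}

pub fn volume_is_high(ctx: &SharedContext) -> bool {
    ctx.avg_volume > 1_000.0
}
```

Each model stores only a pointer-sized handle, so the context data exists once in memory regardless of how many models evaluate it.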

End-to-end explainability

Computational causality has always supported explainability, and DeepCausality is no exception, as it offers a built-in mechanism to understand the causal reasoning process: graph explanation paths. Each cause has a built-in explain function that returns a string describing the cause and how it was evaluated, hence giving an explanation. For a collection of causes, these strings are combined in the order of evaluation. For a graph, the explanation is constructed based on the graph path taken during the reasoning.

Graph path refers to the actual pathway taken through a hypergraph model. While DeepCausality supports standard algorithms such as the shortest path, it also supports reasoning over a custom path in the sense that you can define the start and end node of a specific sub-graph. For even more comprehensive control, you can also retrieve nodes separately and reason over each one individually.

As a result, one can see precisely the complete line of reasoning with the actual evaluation at each stage. While this does not explain why something unexpected happened, it at least points to exactly when and where it happened and the actual evaluation at hand. That is already a solid starting point when it comes to identifying the relevant data and figuring out why it deviated.
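The explanation mechanism can be approximated in a few lines of Rust. The following is a sketch and not the library's actual explain function: `ExplainableCause`, `explain`, and `explain_path` are hypothetical names, and the output format is an assumption made for illustration.

```rust
/// A cause that can describe how it was evaluated (hypothetical type).
pub struct ExplainableCause {
    pub id: usize,
    pub description: &'static str,
    pub func: fn(f64) -> bool,
}

impl ExplainableCause {
    /// Returns a one-line description of the cause and its evaluation.
    pub fn explain(&self, obs: f64) -> String {
        format!(
            "Cause {} ({}) evaluated to {} for input {}",
            self.id,
            self.description,
            (self.func)(obs),
            obs
        )
    }
}

/// Explains a path by joining the per-cause explanations
/// in the order in which the causes are evaluated.
pub fn explain_path(path: &[ExplainableCause], obs: f64) -> String {
    path.iter()
        .map(|cause| cause.explain(obs))
        .collect::<Vec<_>>()
        .join("\n")
}
```

Reading the joined output from top to bottom shows at which step along the path an evaluation turned false, which is the "when and where" the text refers to.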

Causal State Machines

Conventionally, causal models are seen as separated from the subsequent intervention mainly for flexibility reasons. The model-action separation remains valid in many use cases, but there is also a group of use cases where this separation is undesirable. Specifically, dynamic control systems require a fixed link to subsequent actions to preserve deterministic execution. Conventionally, a control system can often be expressed through a finite state machine because the number of all possible system states is known upfront.

DeepCausality comes with a causal state machine that defines states as causes and links each cause to a specific action. Since both the cause and the action are expressed as regular Rust functions that are stored as function pointers, virtually any complex action can be expressed through a causal state machine. Unlike a finite state machine, a causal state machine can be fully generated at runtime by adding or removing causal states dynamically. Therefore, it is not necessary to know possible system states when designing a causal state machine. Instead, when a system that requires supervision comes online, the exact states can be provided through the system metadata. This configures the causal state machine dynamically, which then assumes automated supervision of the system that came online. This process equates to the notion of a dynamic control system because the control system is configured dynamically and examines system control dynamically for as long as the originating system operates.
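A causal state machine of this kind can be sketched with function pointers and a map. The code below is a minimal illustration, not the DeepCausality API: `CausalState`, `CausalStateMachine`, `add_state`, `remove_state`, `eval_single`, `overheated`, and `shut_down` are all hypothetical names. To keep the example testable, the action is assumed to return a short status string rather than perform a side effect.

```rust
use std::collections::HashMap;

/// A causal state links a cause to the action taken when the cause holds.
/// Both are plain Rust functions stored as function pointers.
pub struct CausalState {
    pub cause: fn(f64) -> bool,
    pub action: fn() -> &'static str,
}

/// States are kept in a map so they can be added or removed at runtime.
#[derive(Default)]
pub struct CausalStateMachine {
    states: HashMap<u32, CausalState>,
}

impl CausalStateMachine {
    pub fn new() -> Self {
        Self::default()
    }

    /// Registers (or replaces) a state under the given id.
    pub fn add_state(&mut self, id: u32, state: CausalState) {
        self.states.insert(id, state);
    }

    /// Removes a state; returns it if it existed.
    pub fn remove_state(&mut self, id: u32) -> Option<CausalState> {
        self.states.remove(&id)
    }

    /// Evaluates one state: if its cause holds for the observation,
    /// the linked action fires and its result is returned.
    pub fn eval_single(&self, id: u32, obs: f64) -> Option<&'static str> {
        let state = self.states.get(&id)?;
        if (state.cause)(obs) {
            Some((state.action)())
        } else {
            None
        }
    }
}

/// Example cause: a sensor reading above 90 degrees Celsius.
pub fn overheated(temp_c: f64) -> bool {
    temp_c > 90.0
}

/// Example action linked to the cause above.
pub fn shut_down() -> &'static str {
    "shutting down"
}
```

Because the machine starts empty and states are registered by id, the set of supervised states does not have to be known at design time; it could, for instance, be filled from system metadata when a device comes online.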

What can you do with DeepCausality?

There are several categories of applications for which conventional deep learning remains unsuitable but where DeepCausality may offer future directions. First, the advent of drones has led to an explosion of various monitoring and surveillance solutions across multiple industries. Conventional deep learning may not suit these multidimensional data streams from drones very well, because of the lack of contextualization that would give more meaning to the data and inform decisions. DeepCausality, on the other hand, provides a new method of streaming multiple complex data feeds into a context that serves as a single source of truth to various models, regardless of whether these are deep learning or deep causality models.

Financial markets are full of scenarios in which conventional deep learning falls short because of its inability to capture causal relations across temporal-spatial relations in time series data. DeepCausality has been designed from the ground up to tackle these problems and allows the formation of an instrument-specific context, updated in real-time, to inform one or more models that relate current data to its context to inform trade decisions. Because of the flexibility in designing a context, temporal and spatial patterns can be expressed and tested in real-time, thus significantly reducing the complexity and improving the maintainability of financial models.

Cloud-native applications that require a significant number of dynamic system configurations and monitoring may benefit from simplifying dynamic control systems via causal state machines. This is of particular interest for application service providers that customize cloud solutions for clients.

Industries subject to safety regulations, such as transportation, avionics, or defense, might see DeepCausality as a viable alternative in areas where non-deterministic deep learning cannot be deployed for regulatory or safety reasons. Furthermore, industry monitoring solutions may benefit from the simple and robust design causal state machines provide in terms of the ease of adding new sensors dynamically.

Start-ups aiming to disrupt existing industries may explore any combination of deep learning and DeepCausality to gain a competitive edge over existing solutions in their industries. From a technical perspective, combining deep causality with deep learning models via a shared context is now possible and with these new possibilities, new opportunities will emerge.

About

Marvin Hansen is the director of Emet-Labs, a FinTech research boutique specializing in applying computational causality to financial markets.

DeepCausality is a hyper-geometric computational causality library that enables fast and deterministic context-aware causal reasoning in Rust. Please give us a star on GitHub.