An Event-Driven Book of Reference to facilitate Modernization

Shahir A. Daya
Jun 7, 2021

By Shahir A. Daya, IBM Distinguished Engineer and CTO Financial Services and Mehryar Maalem, Software Engineer at IBM

Mehryar and I work on large digital transformation programs. Our clients are usually well-established organizations that predate the internet and digital commerce. Their architecture can be described as the sum of siloed systems that mirror the organization’s communication structure [1] and are woven together through disjointed point-to-point integration patterns. These environments pose a challenge for modernization efforts.

Over the past few years, our team has been on the ground engaging with various clients to address this challenge. In this blog post, we would like to share a pattern that has proved successful in practice. It is centred on building an Event-Driven Book of Reference with desirable properties that facilitate modernization at scale. We start by summarizing the motivating factors behind modernization and their challenges. Then we spend the rest of the blog post describing the Event-Driven Book of Reference, how it facilitates modernization, and some lessons learned from implementing this pattern at scale.

What are our clients’ goals and challenges?

Accelerating Innovation

The fundamental motivation behind modernization is to accelerate innovation. Our clients are looking for a nimble architecture that enables the rapid development of new experiences and applications while ensuring that existing core systems are not adversely impacted. This goal is usually inhibited by an existing architecture that suffers from data silos, complex interdependencies, and a large number of fragile core systems that may not scale out for these use cases. We see three focus areas for innovation:

  • Experience — Thin vertical slice to enable an experience that wasn’t possible before
  • Data — Data modernization, to bring data together from multiple sources into a modern common platform
  • Integration — Integration simplification, to move away from high cost and complexity legacy integration platforms

Application Modernization

New use cases require new technical capabilities, and new technical capabilities require new software. Yet the existing core systems are still a critical part of the business, making a complete replacement both costly and high risk. As an alternative, the Strangler Pattern is usually recommended: it modernizes existing systems by upgrading one vertical slice of capability at a time. The new and the old therefore co-exist until the system is fully replaced, which usually adds integration complexity. Applications that depend on these systems must orchestrate across both the new and the old until a replacement is viable. This tight coupling is a consistent headache for our clients who are looking to embark on modernization. In most cases, we want to modernize the application incrementally. However, there are also cases where we want to replace a homegrown, custom-built system with a Commercial Off-the-Shelf (COTS) product or a SaaS offering.

What is an Event-Driven Book of Reference?

The Event-Driven Book of Reference is an event-driven data platform that ingests data from all systems of record and translates it into industry-aligned data models for consumption. More technically, it is a warm, read-only replica of all business events across all systems of record, and it serves as the Strangler Facade in the Strangler Pattern [2]. This decouples reads from writes, with the added bonus of hiding the detailed data models of the systems of record behind an industry-aligned data model for consumption. It also drives wide adoption of the CQRS [3] pattern across the organization.

A caveat worth mentioning about this architecture is that the Book of Reference is eventually consistent [4]. This means that for use cases where an inconsistent read can have monetary implications, coupling reads and writes is essential. For example, a system that checks account limits before authorizing a purchase should not rely on an eventually consistent store. But most customer-facing use cases, such as online banking, are well suited to this architecture. In our experience, most use cases do not require strong consistency, and the Book of Reference can be widely adopted as the place for reads.
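The split between reads and writes, and the eventual-consistency caveat, can be illustrated with a minimal in-memory sketch. Everything here is hypothetical and for illustration only: the class names, the `outbox` list standing in for an event stream, and the `needs_strong_consistency` flag are assumptions, not part of any client implementation.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class SystemOfRecord:
    """Hypothetical core system: the authoritative store that accepts writes."""
    balances: Dict[str, int] = field(default_factory=dict)
    outbox: List[dict] = field(default_factory=list)  # events not yet published

    def debit(self, account_id: str, amount: int) -> None:
        new_balance = self.balances.get(account_id, 0) - amount
        self.balances[account_id] = new_balance
        self.outbox.append({"account_id": account_id, "balance": new_balance})

    def read_balance(self, account_id: str) -> int:
        return self.balances.get(account_id, 0)


@dataclass
class BookOfReference:
    """Hypothetical read-only replica: it changes only when events are applied."""
    balances: Dict[str, int] = field(default_factory=dict)

    def apply(self, event: dict) -> None:
        self.balances[event["account_id"]] = event["balance"]

    def read_balance(self, account_id: str) -> int:
        return self.balances.get(account_id, 0)


def publish_pending(sor: SystemOfRecord, book: BookOfReference) -> None:
    """Deliver queued events to the replica (stands in for the streaming pipeline)."""
    while sor.outbox:
        book.apply(sor.outbox.pop(0))


def read_balance(sor: SystemOfRecord, book: BookOfReference, account_id: str,
                 *, needs_strong_consistency: bool) -> int:
    """Route limit-check style reads to the system of record, all others to the replica."""
    source = sor if needs_strong_consistency else book
    return source.read_balance(account_id)
```

Between a `debit` and the next `publish_pending`, a read routed to the replica returns the older balance. That window is harmless for an account summary page but not for an authorization check, which is why the flag sends such reads to the system of record.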

Architecture Overview

The Event-Driven Book of Reference consists of the following components. For data storage, it has a landing zone that stores the raw events ingested directly from the Systems of Record, and a curated zone that holds the industry-aligned representation of those events. All data is stored in an append-only, persistent, and immutable log. The data in the landing zone is ephemeral; its retention is usually set to leave enough cushion to replay events in case of failure. The retention policy for the curated zone, however, is generally driven by the business requirements around the business events it represents. Often, it follows the same retention policy as the underlying system of record that sources the data. Where the industry-aligned data model is an aggregation across multiple systems of record, the maximum retention across those systems is used.
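The retention rule for aggregated models can be written down in a few lines. This is only a sketch: it reads "maximum retention" as the longest retention among the contributing systems of record, and the system names and durations are made up. The conversion to milliseconds is included because log-based platforms such as Kafka express topic retention that way (the `retention.ms` topic setting).

```python
from datetime import timedelta
from typing import Dict


def curated_retention(source_retentions: Dict[str, timedelta]) -> timedelta:
    """Return the longest retention among the systems of record feeding a curated model."""
    if not source_retentions:
        raise ValueError("a curated model needs at least one source system")
    return max(source_retentions.values())


def to_retention_ms(retention: timedelta) -> int:
    """Express a retention period in whole milliseconds."""
    return retention // timedelta(milliseconds=1)


# Hypothetical sources for an aggregated "customer position" model.
sources = {
    "core_banking": timedelta(days=7 * 365),
    "cards": timedelta(days=365),
}
retention = curated_retention(sources)
```

Taking the maximum means the curated model never discards an event earlier than any contributing system would.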

At times, batch jobs are used to hydrate the curated zone with historical data. Otherwise, the data feeding into the Book of Reference should arrive in real time. This means that all state changes are broadcast to the landing zone in real time, which requires the data pipelines to be streaming applications. However, some back-end processes inevitably run as scheduled batches, for example nightly posted-transaction processing at banks. It is therefore often necessary to adopt a batch data processing platform as well, preferably the same one used for the historical batch jobs. Read access to the curated zone can be either through direct access to the data or through APIs built on materialized views [5]. An In-Memory Data Grid [6] can be used for the materialized views in use cases where speed and scale are non-negotiable.
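To make the idea of a materialized view over an append-only log concrete, here is a small sketch. It assumes a curated log represented as a Python list of events with hypothetical `account_id` and `amount_minor_units` fields; a real deployment would keep the view in a database or an in-memory data grid rather than a dictionary.

```python
from typing import Dict, List


class BalanceView:
    """Hypothetical materialized view: per-account balances folded from a curated log."""

    def __init__(self) -> None:
        self.balances: Dict[str, int] = {}
        self.next_offset = 0  # position of the first event not yet applied

    def catch_up(self, log: List[dict]) -> None:
        """Apply only the events appended since the last call."""
        for event in log[self.next_offset:]:
            account = event["account_id"]
            self.balances[account] = (
                self.balances.get(account, 0) + event["amount_minor_units"]
            )
        self.next_offset = len(log)


curated_log = [
    {"account_id": "a", "amount_minor_units": 10_000},
    {"account_id": "a", "amount_minor_units": -3_000},
    {"account_id": "b", "amount_minor_units": 5_000},
]
view = BalanceView()
view.catch_up(curated_log)
```

Because the log is immutable, a view that is lost or needs a new shape can be rebuilt by replaying the log from offset zero, and the result matches an incrementally maintained view.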

In one of our implementations, Apache Kafka, our preferred choice, was used as the Book of Reference. We integrated with the systems of record through Kafka Connect and wrote Kafka Streams applications to transform the feeds into industry-aligned models stored in topics in the curated zone. For all batch jobs, we used Informatica PowerCenter. We then either gave direct access to the topics for notification use cases or built APIs for more complex read patterns, such as historical queries. Since this client was in the financial services sector, we adopted BIAN (Banking Industry Architecture Network) [7] to guide the design of the industry-aligned data models.
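The transformation step from the landing zone to the curated zone is, at its core, a per-record mapping. The sketch below shows that mapping as a plain Python function rather than a Kafka Streams application. The raw field names (`ACCT_NO`, `REGION_CD`, `BAL_AMT`, `CCY`, `UPD_TS`) and the curated shape are invented for illustration; they are not taken from BIAN or from any client system.

```python
from datetime import datetime
from decimal import Decimal


def to_curated_balance(raw: dict) -> dict:
    """Map one raw landing-zone record to a hypothetical industry-aligned balance event."""
    as_of = datetime.strptime(raw["UPD_TS"], "%Y%m%dT%H%M%S")
    return {
        # Prefix with region so the identifier stays unique across regions.
        "account_id": f'{raw["REGION_CD"]}-{raw["ACCT_NO"]}',
        # Decimal avoids floating-point rounding on monetary amounts.
        "balance": {"amount": Decimal(raw["BAL_AMT"]), "currency": raw["CCY"]},
        "as_of": as_of.isoformat(),
    }


raw_event = {
    "ACCT_NO": "000123",
    "REGION_CD": "CA",
    "BAL_AMT": "1050.25",
    "CCY": "CAD",
    "UPD_TS": "20210607T120000",
}
curated_event = to_curated_balance(raw_event)
```

Keeping the mapping a pure function of a single record makes it easy to test in isolation and to run unchanged in both the streaming pipeline and the historical batch jobs.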

The figure below shows the architecture for the Event-Driven Book of Reference.

Figure 1: The Event-Driven Book of Reference Architecture

A Vertical Slice

A vertical slice is the set of all applications that, put together, enable a specific business capability or customer experience. For example, take the account summary page in online banking. A vertical slice would be all the systems and applications that make that view possible for a customer, such as the system of record that stores balances, the back-end service that serves that data, and the micro-frontend that visualizes the data in the web application.

In this architecture, the Systems of Record that are part of the vertical slice are ingested into the Event-Driven Book of Reference, and materialized views are built on top of the topics in the curated zone. Figure 2 shows what a vertical slice through the architecture looks like.

Figure 2: A Vertical Slice through the architecture

How does the Book of Reference facilitate Modernization?

Industry-aligned data models decouple consumers from the underlying implementation of existing and new core systems. This decoupling means that the new and the old system can co-exist and then be swapped once the new one has proved to meet all the requirements of the old. In a perfect implementation, system modernization is transparent to the consuming applications. Using this pattern, one of our clients managed to modernize critical core systems while also building a flagship customer-facing product. Our client was able to ship this new experience to both new and legacy customers. While this customer experience was being built, the modernization efforts were transparent to the technical teams, who interfaced only with the Event-Driven Book of Reference throughout development. We found that by abstracting the systems of record completely from the application teams, the time from ideation to shipping to production was reduced by approximately 16 months for our client!
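A small sketch can show why the swap is invisible to consumers. It assumes two hypothetical source formats, one for a legacy core and one for its replacement, each with its own translator into the same curated shape. The consumer, `render_account_summary`, only ever sees the curated shape. All names and field layouts here are illustrative assumptions.

```python
from decimal import Decimal


def from_legacy_core(raw: dict) -> dict:
    """Translate a hypothetical legacy record (string amounts) to the curated shape."""
    return {
        "account_id": raw["ACCT_NO"],
        "balance_minor_units": int(Decimal(raw["BAL_AMT"]) * 100),
    }


def from_new_core(raw: dict) -> dict:
    """Translate a hypothetical record from the replacement system to the curated shape."""
    return {
        "account_id": raw["accountId"],
        "balance_minor_units": raw["balance"]["minorUnits"],
    }


def render_account_summary(curated: dict) -> str:
    """Consumer code: depends only on the curated model, never on a source format."""
    units = curated["balance_minor_units"]
    return f'Account {curated["account_id"]}: {units // 100}.{units % 100:02d}'


legacy_record = {"ACCT_NO": "000123", "BAL_AMT": "1050.25"}
new_record = {"accountId": "000123", "balance": {"minorUnits": 105025}}

before_swap = render_account_summary(from_legacy_core(legacy_record))
after_swap = render_account_summary(from_new_core(new_record))
```

Replacing the core system changes which translator feeds the curated zone; `render_account_summary` is not touched.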

Lessons Learned

Here are some lessons learned that we would like to share from implementing this pattern at scale:

  • Most legacy systems do not have instrumentation built in to feed into an event stream. Their existing data models do not lend themselves well to an event-driven and industry-aligned data model. This requires complex applications to be built for the translation component. We recommend pushing as much of the translation to the underlying System of Record as possible to minimize the complexity of the data pipelines.
  • Standardized industry-aligned data models are challenging to build. We recommend using pre-existing standards if your industry has one. For example, for some of our financial services clients, using BIAN has proved to be successful.
  • Since the Book of Reference unifies all systems of record, global keys are necessary to uniquely identify entities that span Systems of Record. For example, in one case we had to deal with account numbers that were not unique across regions. This worked in the existing system because each region was handled separately. In our case, we introduced globally unique identifiers, a first at our client, in the curated zone. Because the existing architectures usually did not necessitate such a key, one is hard to come by.
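One possible way to derive a global key for the last lesson is a deterministic, name-based UUID computed from the region and the regional account number. This is an illustration of the idea only, not a description of how identifiers were generated at the client. The namespace string and the `region:account_number` naming scheme are assumptions.

```python
import uuid

# Hypothetical namespace for keys minted in the curated zone.
BOOK_OF_REFERENCE_NAMESPACE = uuid.uuid5(uuid.NAMESPACE_DNS, "book-of-reference.example")


def global_account_key(region: str, account_number: str) -> uuid.UUID:
    """Derive a stable, globally unique key from a region-scoped account number."""
    return uuid.uuid5(BOOK_OF_REFERENCE_NAMESPACE, f"{region}:{account_number}")


# The same account number in two regions maps to two different global keys.
canada_key = global_account_key("CA", "000123")
us_key = global_account_key("US", "000123")
```

Because the key is derived rather than randomly generated, every pipeline that sees the same region and account number computes the same identifier without needing a shared lookup table.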

In Summary

To conclude, an Event-Driven Book of Reference facilitates modernization. It serves as the facade of the Strangler Pattern, decoupling consuming applications from the Systems of Record and providing the co-existence layer that later enables full core replacement. It also provides a foundation for adopting the CQRS pattern, where reads are based on industry-aligned data models and writes are based on the Systems of Record. This decoupling provides a lasting foundation for future growth. We have implemented this at various clients, and it has proven to be a successful pattern for modernization.

In subsequent blogs, Mehryar and I, along with other team members, will cover various aspects of the Event-Driven Book of Reference architecture, including:

  • Enabling self-service of the data platform using a Metadata Repository for Data Discovery and Lineage
  • Monitoring latency and detecting data loss through the data supply chain
  • High Availability and Disaster Recovery
  • Among other topics

We hope you found this article to be helpful. We are always learning, so if you have any suggestions or experiences with similar approaches, please share them in the comments section. We love to hear and learn from others. If you have suggestions for what particular aspect of this pattern you would like us to go into next, please put that in the comments. And please don’t forget to follow and clap!

References

[1] “Conway’s law - Wikipedia”, En.wikipedia.org, 2021. [Online]. Available: https://en.wikipedia.org/wiki/Conway%27s_law. [Accessed: 07-Jun-2021].

[2] “Strangler Fig pattern - Cloud Design Patterns”, Docs.microsoft.com, 2021. [Online]. Available: https://docs.microsoft.com/en-us/azure/architecture/patterns/strangler-fig. [Accessed: 07-Jun-2021].

[3] M. Fowler, “CQRS”, martinfowler.com, 2021. [Online]. Available: https://martinfowler.com/bliki/CQRS.html. [Accessed: 07-Jun-2021].

[4] “Eventual consistency - Wikipedia”, En.wikipedia.org, 2021. [Online]. Available: https://en.wikipedia.org/wiki/Eventual_consistency. [Accessed: 07-Jun-2021].

[5] “Materialized View pattern - Cloud Design Patterns”, Docs.microsoft.com, 2021. [Online]. Available: https://docs.microsoft.com/en-us/azure/architecture/patterns/materialized-view. [Accessed: 07-Jun-2021].

[6] “In-Memory Data Grid - Apache Ignite”, Ignite.apache.org, 2021. [Online]. Available: https://ignite.apache.org/use-cases/in-memory-data-grid.html. [Accessed: 07-Jun-2021].

[7] “Home - BIAN”, BIAN, 2021. [Online]. Available: https://www.bian.org. [Accessed: 07-Jun-2021].

Shahir A. Daya

Shahir Daya is CTO at Zafin and Former IBM Distinguished Engineer.