Pouya Ataei
DOCTOR OF PHILOSOPHY
Supervisors: Dr Alan Litchfield, Dr Stephen Thorpe
School of Engineering, Computer and Mathematical Sciences
Abstract
Today, people are the ceaseless generators of structured, semi-structured, and unstruc-
tured data that, if gleaned and processed, can reveal game-changing patterns. Addition-
ally, advancements in technology have made it easier and faster to collect and analyse
this data. This has led to the age of big data, which began when the
volume, variety, and velocity of data overwhelmed traditional systems.
Many businesses have attempted to harness the power of big data; nevertheless, the
success rate is low. According to multiple surveys, only about 20% of big data projects are
successful. This is due to the challenges of adopting big data, such as organisational
culture, rapid technological change, system complexity, and data architecture. This
thesis aims to address the data architecture challenges of adopting big data by introducing a
domain-driven, decentralised big data reference architecture.
This reference architecture is designed specifically to mitigate big data challenges
by providing a scalable data architecture for big data systems, flexible and rapid data
processing for varied velocity, adaptable management for a wide variety of data formats,
a maintainable approach to data discovery and aggregation, and increased attention
to cross-cutting concerns such as metadata, privacy and security. This research uses
design science research as the underlying research framework while utilising empirically
grounded reference architecture guidelines for the development of the artefact. The
evaluation of the artefact involves two distinct methods: a case-mechanism experiment
and expert opinion, ensuring a comprehensive assessment of the big data reference
architecture.
This process supports the reference architecture’s usefulness and effectiveness,
demonstrating that it can handle the volume, velocity, and variety of big data
through rapid data processing, scalability, and adaptability to different data
formats. Additionally, the reference architecture’s design mitigates the complexity of
monolithic data pipelines, decentralises data ownership to avoid bottlenecks, and fosters
a more integrated, agile approach to big data systems. This study positions itself as
a progressive step in big data reference architectures, directly targeting and offering
solutions to the existing shortcomings of big data architectures. It is aimed primarily at
data architects and researchers seeking innovative approaches in big data system design
and development, as well as practitioners looking to understand and apply the latest
advancements in big data architectures.
Contents
Abstract 2
Attestation of Authorship 12
Publications 13
Acknowledgements 14
1 Introduction 15
1.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16
1.2 Context and Significance . . . . . . . . . . . . . . . . . . . . . . . . . . 17
1.2.1 What is Big Data? . . . . . . . . . . . . . . . . . . . . . . . . . 17
1.2.2 The Value of Big Data . . . . . . . . . . . . . . . . . . . . . . . 18
1.2.3 Reference Architectures . . . . . . . . . . . . . . . . . . . . . . 19
1.2.4 Significance of Reference Architectures . . . . . . . . . . . . . 20
1.2.5 Insights from Current Big Data Reference Architectures . . . . 21
1.2.6 Reference Architecture’s Role in Addressing Big Data Challenges 22
1.2.7 Microservices and Service Distribution Patterns . . . . . . . . 23
1.3 Overview of the Research . . . . . . . . . . . . . . . . . . . . . . . . . . 25
1.3.1 Big Data, Big Bang? . . . . . . . . . . . . . . . . . . . . . . . . 25
1.3.2 The Challenge . . . . . . . . . . . . . . . . . . . . . . . . . . . . 26
1.3.3 The Solution . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 27
1.4 Motivation, Methods, and Philosophy . . . . . . . . . . . . . . . . . . . 28
1.4.1 Motivation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 28
1.4.2 Methods and Philosophy . . . . . . . . . . . . . . . . . . . . . . 29
1.5 Emergence of Research Gaps and Direction of Study . . . . . . . . . . 31
1.6 Research Questions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 32
1.7 Thesis Structure . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 33
1.8 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35
2 Research Methodology 36
2.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 37
2.2 What Makes a Good Information Systems Research? . . . . . . . . . . 38
2.3 Research questions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 39
2.4 The Selection of the Research Approach . . . . . . . . . . . . . . . . . 40
2.5 The Underlying Philosophy . . . . . . . . . . . . . . . . . . . . . . . . . 42
2.6 Research Design . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 44
2.7 Design Science Research . . . . . . . . . . . . . . . . . . . . . . . . . . 46
2.8 Research Goal . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 50
2.9 Design Cycle . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 50
2.9.1 Step 1: Stakeholder and Goal Analysis: . . . . . . . . . . . . . 52
2.9.2 Step 2: Problem Investigation . . . . . . . . . . . . . . . . . . . 55
2.9.3 Step 3: Requirement Specification . . . . . . . . . . . . . . . . 57
2.9.4 Step 4: Treatment Design . . . . . . . . . . . . . . . . . . . . . 61
2.9.5 Step 5: Treatment Validation . . . . . . . . . . . . . . . . . . . 69
2.9.6 Step 6: Treatment Implementation and Evaluation . . . . . . . 78
2.10 Iterations and Evolution . . . . . . . . . . . . . . . . . . . . . . . . . . . 83
2.11 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 85
4.5.3 Data Extraction and Synthesis . . . . . . . . . . . . . . . . . . . 125
4.6 Findings . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 128
4.6.1 What are the Fundamental Concepts of RAs? . . . . . . . . . . 128
4.6.2 How can RAs Help BD System Development? . . . . . . . . . 130
4.6.3 What are Some Common Approaches to Creating BD RAs? . 132
4.6.4 Challenges of Creating BD RAs . . . . . . . . . . . . . . . . . 134
4.6.5 What are Current BD RAs? . . . . . . . . . . . . . . . . . . . . 135
4.6.6 Major Architectural Components of BD RAs . . . . . . . . . . 139
4.6.7 What are the Limitations of Current BD RAs? . . . . . . . . . 146
4.7 Discussion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 159
4.8 Threats to Validity . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 162
4.9 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 163
6.4.14 Architectural Styles . . . . . . . . . . . . . . . . . . . . . . . . . 245
6.4.15 Architectural Characteristics . . . . . . . . . . . . . . . . . . . . 246
6.5 Artefact . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 249
6.5.1 Metamycelium . . . . . . . . . . . . . . . . . . . . . . . . . . . 250
6.6 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 271
9 Discussion 347
9.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 348
9.2 Key Insights and Observations . . . . . . . . . . . . . . . . . . . . . . . 349
9.3 Analysis and Interpretation of Findings . . . . . . . . . . . . . . . . . . 355
9.3.1 Contextualising the Findings: . . . . . . . . . . . . . . . . . . . 356
9.3.2 Comparative Analysis: . . . . . . . . . . . . . . . . . . . . . . . 358
9.4 Implications for Practice and Fields of Research . . . . . . . . . . . . . 360
9.4.1 Implications for Practice . . . . . . . . . . . . . . . . . . . . . . 361
9.4.2 Implications for Fields of Research . . . . . . . . . . . . . . . . 363
9.5 Generalisability and Transferability . . . . . . . . . . . . . . . . . . . . 365
9.5.1 Public Availability of Research Artefacts . . . . . . . . . . . . 366
9.5.2 Published Validation . . . . . . . . . . . . . . . . . . . . . . . . 366
9.5.3 Note on Research Scope . . . . . . . . . . . . . . . . . . . . . . 367
9.6 Research Process and Insights . . . . . . . . . . . . . . . . . . . . . . . 368
9.6.1 Navigating Technical Complexities and Communication . . . 369
9.6.2 Academic Reception and Industry Perspective . . . . . . . . . 369
9.6.3 Industry Perception of RAs . . . . . . . . . . . . . . . . . . . . 370
9.6.4 Resource Constraints . . . . . . . . . . . . . . . . . . . . . . . . 371
9.7 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 371
10 Conclusion 373
10.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 373
10.2 Recapitulation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 374
10.2.1 The Importance and Challenges of Big Data . . . . . . . . . . . 375
10.2.2 Research Questions: Addressing BD Project Failures . . . . . 376
10.3 The Big Data Landscape . . . . . . . . . . . . . . . . . . . . . . . . . . . 376
10.4 Metamycelium: A Solution in Context . . . . . . . . . . . . . . . . . . 378
10.5 Unique Contributions to the Field of Research . . . . . . . . . . . . . . 379
10.6 Limitations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 381
10.7 Future Directions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 382
10.8 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 383
References 385
Appendices 410
List of Tables
List of Figures
7.1 Metamycelium Prototyping Phases . . . . . . . . . . . . . . . . . . . . . 279
7.2 Metamycelium Prototype . . . . . . . . . . . . . . . . . . . . . . . . . . 280
7.3 Open Policy Agent Communication Flow with Other Services . . . . . 283
7.4 Data Lichen Dashboard in Local Development Environment . . . . . . 290
7.5 Intra-domain communication in Metamycelium’s prototype . . . . . . 291
7.6 Overview of Kafka topics . . . . . . . . . . . . . . . . . . . . . . . . 292
7.7 Scenario S-1 Flow Diagram . . . . . . . . . . . . . . . . . . . . . . . . . 292
7.8 Open Telemetry Topic . . . . . . . . . . . . . . . . . . . . . . . . . . . . 293
7.9 Open Telemetry Flow . . . . . . . . . . . . . . . . . . . . . . . . . . . . 293
7.10 Open Telemetry Trace Span . . . . . . . . . . . . . . . . . . . . . . . . . 294
7.11 Kafka Data Ingestion in Bytes in Customer Domain’s Analytical Service 297
7.12 Total Data Ingested in Bytes in Customer Domains’ Analytical Service 297
7.13 Total Data Ingested in Bytes in Weather Domain’s Analytical Service 297
7.14 Customer and Weather Domain’s Analytical Service Latencies . . . . 299
7.15 Open Telemetry Service CPU Usage . . . . . . . . . . . . . . . . . . . . 301
7.16 Kubernetes Cluster CPU Usage . . . . . . . . . . . . . . . . . . . . . . . 302
7.17 Kubernetes Cluster Read and Write Statistics . . . . . . . . . . . . . . . 303
7.18 Kubernetes Cluster Network Statistics . . . . . . . . . . . . . . . . . . . 303
7.19 Memory Usage for Various Services in the Cluster . . . . . . . . . . . 305
7.20 Number of Errors in Telemetry Processing Service . . . . . . . . . . . 307
7.21 Customer Domain Streaming Topic Statistics . . . . . . . . . . . . . . . 308
7.22 Memory Utilisation in Customer Domain in High Velocity Case . . . . 309
7.23 Memory Utilisation in Weather Domain in High Velocity Case . . . . 310
7.24 Kafka Ingestion Latency in Streaming Case . . . . . . . . . . . . . . . 310
7.25 Processing Duration in Streaming Case in the Weather Domain . . . . 311
7.26 Data Scientist Flow with Authentication . . . . . . . . . . . . . . . . . 313
7.27 Data Scientist Secret Retrieval Delay in Seconds . . . . . . . . . . . . 314
7.28 CPU Utilisation in Data Scientist Application . . . . . . . . . . . . . . 314
7.29 Memory Utilisation in Data Scientist Application . . . . . . . . . . . . 315
7.30 Query Processing Duration in Data Science Application . . . . . . . . 315
Attestation of Authorship
Signature of candidate
Publications
Acknowledgements
I would like to express my sincere gratitude to Dr. Alan Litchfield for his invaluable
guidance and supervision throughout this research. His insights have been fundamental
in shaping the direction of this study.
I am grateful to the experts who participated in the study and to all staff members at
Auckland University of Technology (AUT) involved with PhD students, whose support
and wisdom have been indispensable.
I extend my appreciation to fellow researchers, including Daniel Staegemann, for their
collaboration and insights which significantly contributed to my academic development.
Additionally, I acknowledge the constructive feedback received from various journals
and conferences, particularly on papers that were not accepted, as this feedback was
crucial in refining my research.
Lastly, I would like to thank Dr. Stephen Thorpe for reviewing parts of this thesis and
providing valuable feedback.
Chapter 1
Introduction
[Figure: thesis roadmap linking Section 1.2 (Context and Significance), Sections 1.2.4 to 1.2.6, Section 1.3 (Overview of the Research), Section 1.4.1 (Motivation), and Section 1.5 (Emergence of Research Gaps and Direction of Study) to Chapter 4 (Big Data Reference Architectures) and Chapters 2 to 10.]
1.1 Introduction
This study is situated within the domain of Information Systems Research, specifically
addressing the architectural challenges in Big Data (BD) systems. The research problem
centres on the limitations of current BD Reference Architectures (RAs), particularly
their struggles with system evolution, data governance, and architectural adaptability in
rapidly changing technological landscapes. Through the application of Design Science
Research (DSR), this study develops and evaluates Metamycelium, a novel domain-
driven distributed RA that addresses the scalability, maintainability, and adaptability of
BD systems while providing data governance frameworks. The research contributes both
to the theoretical understanding of BD architectures and to practical implementation
approaches in complex data environments.
This introductory chapter lays the foundation for the thesis, which addresses the
development of a novel RA, Metamycelium, for BD systems. Metamycelium presents a
means for capturing best practices and expressing them through architectural constructs.
This artefact provides a solution to the limitations of current BD RAs, such as maintain-
ability, scalability, data quality, and cross-cutting concerns, through the application of
DSR methodology.
The chapter begins by providing essential context in Section 1.2, defining BD, and
explaining its value across various sectors. It highlights how BD is conceptualised in
this research and discusses the tangible benefits and applications of BD in different
fields, including healthcare, energy exploration, and entertainment. The chapter also
introduces the concept of RAs and their significance in the IT field, particularly in the
development of complex systems like BD environments. It then explores the concepts of
microservices and decentralised, distributed architectures, underscoring their relevance
and benefits.
Following this, Section 1.3 presents an overview of the research, situating the study
within the context of the rapidly evolving BD landscape and the proliferation of BD
technologies. It emphasises the need for advanced data management and processing
systems in the BD era.
Section 1.4 then delves into the motivation, methods, and philosophy underpinning
the research. This section examines the critical challenges in current BD architectures,
including data ownership conflicts, scalability limitations, and integration complexities.
It explores how these challenges impact system effectiveness and organisational success.
The section then outlines the research methods employed, including systematic litera-
ture reviews (SLRs) and a narrative literature review, and discusses the philosophical
approach that guides the investigation of these challenges.
Finally, the chapter concludes by presenting, in Section 1.6, the primary research
questions investigated in this thesis. These questions aim to address critical
challenges in the field of BD systems and their architectural designs, setting the stage
for the exploration and analysis in the subsequent chapters.
1.2 Context and Significance
This section provides the foundational definitions essential for comprehending the nuances
of the research and the conceptual framework necessary to understand the terminology
used in the thesis.
1.2.1 What is Big Data?
To define BD within the scope of this research, various definitions from the body of
knowledge have been examined. Kaisler, Armour, Espinosa and Money (2013) define
BD as “the amount of data that is beyond technology’s capability to store, manage,
and process efficiently”. Srivastava (2018) states that BD pertains to “the use of large
data sets to handle the collection or reporting of data that serves various recipients in
decision making”.
Sagiroglu and Sinanc (2013) describe BD as “a term for massive data sets having a
large, more varied, and complex structure with the difficulties of storing, analysing, and
visualising for further processes or results”.
Drawing from these definitions, BD in this research is conceptualised as “datasets
characterised by high volume, velocity, and variety, which require advanced technologies
and analytical methods for their transformation into value. These datasets typically
exceed the processing capabilities of conventional relational database management
systems (RDBMS) for managing and analysing data. Big Data is further defined by
its veracity, or the level of reliability and accuracy of the data, which poses significant
challenges in data processing and analysis”.
1.2.2 The Value of Big Data
The significance and value derived from BD remain pronounced (Ataei & Litchfield,
2022). Extensive discussions on the concept permeate reports, statistics, research, and
conferences (H. Chen, Chiang & Storey, 2012). Notably, prominent companies like
Google, Facebook, Netflix, and Amazon have propelled this momentum with substantial
investments in BD initiatives (Rada, 2017). A compelling illustration of the tangible
benefits that BD offers can be seen in the Netflix Prize recommender system. This
system capitalised on a diverse array of data sources, including user queries, ratings,
search terms, and various demographic indicators (Amatriain, 2013). By implementing
BD-powered recommendation algorithms, Netflix not only achieved a considerable
increase in TV series consumption but also observed certain series experiencing up to a
fourfold surge in viewership (Amatriain, 2013).
In a healthcare context, the Taiwanese government adeptly merged its national
components and their interactions are important aspects, RAs primarily enable archi-
tects to make decisions based on stakeholder needs, technical constraints, and quality
requirements. This approach ensures architectural decisions align with stakeholder re-
quirements and business objectives, while providing a framework for evaluating design
trade-offs. This clarity fosters the creation of manageable modules, each addressing dis-
tinct aspects of complex problems, and provides a high-level platform for stakeholders
to engage, contribute, and collaborate (Nakagawa, Oquendo & Maldonado, 2014).
Given the lack of a standard definition for “reference architecture”, the definition
from Clements, Garlan, Little, Nord and Stafford (2003) is adopted, which describes it
as a pre-established framework (an abstract blueprint) for designing software within a
specific field. This framework, comprising structures, elements, and their relationships,
serves as a template for creating concrete architectures (Pourmirza, Peters, Dijkman &
Grefen, 2017).
qualities and higher levels of abstraction. They aim to capture the essence of practice and
integrate well-established patterns into cohesive frameworks, encompassing elements,
properties, and interrelationships.
The significance of RAs in BD is multifaceted, encompassing aspects like com-
munication, complexity control, knowledge management, risk mitigation, fostering
future architectural visions, defining common ground, enhancing understanding of BD
systems, and facilitating further analysis.
1.2.5 Insights from Current Big Data Reference Architectures
The systematic analysis in Chapter 4 reveals insights about current BD RAs and their
implementation challenges. The SLR, examining 79 studies from both academic and
industrial sources, identifies several key findings regarding the current state of BD RAs.
Current BD implementations face significant architectural challenges. Many organisations
develop BD systems using ad-hoc architectural approaches that diverge
from established software engineering practices (Gorton & Klein, 2015). This leads
to suboptimal architectural decisions and creates difficulties in system evolution. The
SLR reveals that while RAs exist in the BD domain, they often mirror traditional data
warehousing approaches, lacking the flexibility and scalability required for modern BD
systems.
The analysis identifies several important limitations in current BD RAs. Notably,
the NIST BD RA and Lambda architecture (S1), while providing comprehensive frame-
works, show deficiencies in addressing essential elements such as metadata management,
data quality controls, and privacy considerations. These limitations are particularly
evident in areas of data governance, quality management, and system scalability.
The findings highlight a gap between theoretical architectural models and practical
implementation needs. While BD RAs aim to provide standardized approaches, they
often lack concrete guidance for addressing common implementation challenges such
as data quality management, system scalability, and cross-team collaboration. This
gap is particularly evident in the context of evolving BD technologies and changing
business requirements.
These insights from the SLR underscore the need for more mature and practical BD
RAs that can effectively address current implementation challenges while providing flex-
ible frameworks for future evolution. The findings suggest that future research should
focus on developing RAs that better integrate modern architectural patterns, particularly
in areas of microservices, event-driven architectures, and metadata management.
1.2.6 Reference Architecture’s Role in Addressing Big Data Challenges
While recognising that RAs are not the exclusive solution for tackling BD challenges,
they represent an effective approach due to several inherent advantages. It is important
to note that alternative methods, such as bespoke system development and modular
architecture approaches, also hold merit in certain contexts. However, the specific
strengths of RAs in addressing BD challenges include:
4. Incorporating Robust Security and Privacy Protocols: RAs can embed security
and privacy considerations into the architecture from the outset. While other
methods can also integrate these aspects, RAs provide a comprehensive and
uniform approach to ensure system reliability and address vulnerabilities.
Considering the range of artefacts available for tackling BD challenges, RAs stand
out for their capacity to provide a comprehensive, standardised, and scalable frame-
work. Their suitability for the complexities and diverse demands of BD systems is
evident (Cloutier et al., 2010b). Nonetheless, the decision to utilise an RA should be
guided by the specific requirements of the project, the organisational context, and the
characteristics of the data at hand.
1.2.7 Microservices and Service Distribution Patterns
The architectural decisions around service distribution are nuanced and context-
dependent, requiring careful consideration of when to centralise or distribute compo-
nents, and to what degree. This architectural style, as highlighted by Richards (2015),
requires balancing various factors including data consistency, network latency, and
system reliability.
In modern systems, data and processing distribution decisions are made based on
specific requirements and constraints, as discussed by Coulouris, Dollimore and Kind-
berg (2005). These decisions introduce trade-offs around the CAP theorem: systems
must balance consistency, availability, and partition tolerance based on their specific
needs and use cases.
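To make this trade-off concrete, the sketch below illustrates the classic quorum condition (R + W > N) that Dynamo-style replicated stores use to reason about consistency versus availability; the example is purely illustrative and is not drawn from any specific system discussed in this thesis.

# Illustrative sketch (not from this thesis): the quorum rule R + W > N is a
# common way to reason about consistency/availability trade-offs in
# replicated, Dynamo-style data stores.

def is_strongly_consistent(n_replicas: int, write_quorum: int, read_quorum: int) -> bool:
    """A read overlaps the latest write only if R + W > N."""
    return read_quorum + write_quorum > n_replicas

# With three replicas, majority reads and writes (R = W = 2) guarantee overlap,
# favouring consistency, but both quorums must be reachable during a partition.
print(is_strongly_consistent(3, 2, 2))  # True

# Single-replica reads and writes (R = W = 1) remain available under partition,
# but reads may return stale data, favouring availability.
print(is_strongly_consistent(3, 1, 1))  # False

In practice, the choice of N, R, and W is only one of many levers an architect can tune against the specific consistency and availability needs noted above.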
The convergence of microservices within these architectures represents both an
opportunity and a challenge in software engineering. While it enables systems that are
modular and adaptable, it also requires careful consideration of service boundaries, inter-
service communication patterns, and failure handling mechanisms. Organisations must
evaluate whether the benefits of specific distribution patterns outweigh their operational
overhead for each component and use case.
A key consideration in these architectures is their scalability. In the context of
modern systems, scalability refers to a system’s ability to handle increased load. There
are two primary approaches to scaling, each appropriate in different contexts:
• Horizontal Scaling: Also known as "scaling out", this involves adding more
machines to a resource pool. In microservices, this often means deploying more
instances of a service. While this approach provides better fault tolerance and
can handle larger loads, it introduces complexity in load balancing and data
consistency.
• Vertical Scaling: Also known as "scaling up", this involves adding more power
(CPU, RAM) to an existing machine. For microservices, this might mean in-
creasing the resources allocated to a service instance. This approach is simpler to
manage but has physical limitations and potential single points of failure.
Microservices architecture facilitates both types of scaling, but teams must carefully
consider their specific requirements, operational capabilities, and resource constraints
when choosing between these approaches. The decision often involves balancing factors
such as cost, complexity, reliability, and performance requirements. The key is not
choosing between centralisation and distribution as absolute approaches, but rather
understanding when and why to apply each pattern based on concrete requirements and
constraints.
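As a concrete, hedged illustration of the two scaling approaches described above, the following sketch uses the official Kubernetes Python client to scale a hypothetical microservice deployment; the deployment name (customer-analytics), namespace, and resource figures are assumptions made for illustration and are not taken from this thesis.

# Illustrative sketch only: scaling a hypothetical "customer-analytics"
# deployment with the official Kubernetes Python client. All names and
# resource figures are assumptions for illustration.
from kubernetes import client, config

config.load_kube_config()  # use the local kubeconfig, e.g. a development cluster
apps = client.AppsV1Api()

# Horizontal scaling ("scaling out"): run more replicas of the same service.
# This improves fault tolerance and throughput, but load balancing and data
# consistency across instances become concerns.
apps.patch_namespaced_deployment_scale(
    name="customer-analytics",
    namespace="demo",
    body={"spec": {"replicas": 5}},
)

# Vertical scaling ("scaling up"): give each instance more CPU and memory.
# This is simpler to operate but is bounded by node capacity and leaves a
# single point of failure if only one replica exists.
apps.patch_namespaced_deployment(
    name="customer-analytics",
    namespace="demo",
    body={"spec": {"template": {"spec": {"containers": [{
        "name": "customer-analytics",
        "resources": {
            "requests": {"cpu": "2", "memory": "4Gi"},
            "limits": {"cpu": "4", "memory": "8Gi"},
        },
    }]}}}},
)

Either call triggers a rolling update; which one is appropriate depends on the cost, reliability, and consistency factors discussed above.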
1.3 Overview of the Research
This section entails a brief history of BD, its adoption, the challenges, the solutions
to those challenges, and the contribution of this thesis. This section is important as it
instils a general understanding of the current state of BD and the elements surrounding it.
The growth of the internet and digital technologies has significantly increased global
connectivity. The widespread adoption of smartphones, IoT devices, and high-speed
networks has led to a substantial increase in data generation and transmission. Modern
network infrastructures now commonly support data speeds of 1 Gbps or higher, while
mobile devices and IoT sensors continuously collect and transmit data. This technologi-
cal landscape has made data increasingly prevalent in various aspects of daily life and
business operations. This ubiquity of data acts as a fundamental element, laying the
groundwork for the intricate interplay between advanced technology and architectural
Driven by the ambition to harness the power of this large amount of data, the term
‘Big Data’ was coined (Lycett, 2013). BD initially emerged to address the challenges
associated with various characteristics of data, such as velocity, variety, volume, and
variability (Rada, Ataeib, Khakbizc & Akbarzadehd, 2017a). BD engineering is the
practice of extracting patterns, theories, and predictions from a large set of structured,
semi-structured, and unstructured data for the purposes of business competitive advan-
tage (Rada, Ataeib, Khakbizc & Akbarzadehd, 2017b; Huberty, 2015). BD represents
an important innovation, marking the beginning of a new era in the data-oriented indus-
try. However, BD is not a solution that can automatically resolve any business process
challenge. While BD offers numerous opportunities, subsuming such an emergent and
high-impacting technology into the existing frameworks and current state of affairs of
organisations is a daunting task.
According to a recent survey from Technology Review Insights in partnership with
Databricks (2021), only 13% of organisations excel at delivering on their data
strategy. Another survey by NewVantage (2021) indicated that only 24% of organi-
sations have successfully gone data-driven. This survey also states that only 30% of
Since an ad-hoc approach to BD system development may not provide
the most impactful results, novel data architectures that are designed specifically for
BD are required. To contribute to this goal, the notion of RAs is explored, and a
distributed domain-driven software RA for BD systems is presented. This RA is called
Metamycelium. The main contributions of this study are threefold: 1) explicating design
theories underlying current BD systems and their limitations; 2) explicating design
theories that generate the artefact’s constructs; and 3) the artefact itself. A detailed
description of the unique contributions of this study is provided in Chapter 10, Section 10.5.
1.4 Motivation, Methods, and Philosophy
This section presents the motivations driving the research, the methods employed to
navigate the research, and the philosophical underpinnings guiding the approach.
1.4.1 Motivation
The motivation for this research is derived from a critical examination of prevailing BD
RAs, as detailed in a SLR (Chapter 4). This examination reveals a significant reliance
on monolithic data pipeline architectures in current BD RAs, presenting constraints
in scalability, adaptability to rapid technological changes, and effective integration
of diverse data sources. Furthermore, challenges such as centralised data ownership
models and inefficiencies in data management by specialised, yet isolated teams are
recurrently identified.
In response, this study introduces Metamycelium, a proposed RA that shifts towards
a decentralised, domain-driven approach. This architectural shift aims to enhance
scalability and adaptability, contrasting with the limitations of traditional BD RAs.
Metamycelium also focuses on implementing efficient data management practices and
reinforcing security and privacy measures. The motivation behind this research is to
propose a more dynamic and responsive RA for BD systems, contributing a novel
perspective to the BD RA field and addressing some of the key gaps identified, thereby
laying the groundwork for future advancements in this area.
1.4.2 Methods and Philosophy
Within the DSR framework, defining key elements like research approach, under-
lying philosophy, research design, and research methods is essential. This ensures a
systematic transition from a broad problem in BD to detailed methodologies. The cho-
sen research approach is qualitative, given its appropriateness for exploring emerging
phenomena in the context of a novel domain-driven distributed architecture for BD
systems. This approach enables inductive building from data, moving from particulars
to general themes, and interpreting the data’s relevance in the context of BD system
development.
This DSR-driven research design encompasses several key stages: stakeholder and goal
analysis, problem investigation, requirement specification, treatment design, treatment
validation, and treatment implementation and evaluation.
Each stage is important for ensuring the artefact’s alignment with practical needs in
BD systems, its effectiveness, and its knowledge contribution.
In the problem investigation stage, SLRs and narrative literature review methods
capture the current state of BD RAs, identifying existing gaps and trends. Treatment
design integrates empirical data into the architectural design, ensuring that the RA is
both theoretically robust and resonant with real-world BD practices and challenges. Sub-
sequent stages, namely treatment implementation, validation, and evaluation, involve
developing and evaluating a RA prototype through methods like expert opinions and
case mechanism experiments. These methods emphasise the RA’s practical relevance
and applicability in the dynamic field of BD.
Adhering to the DSR framework, underpinned by a pragmatic philosophy, the study
demonstrates a commitment to blending academic rigour with practical utility in BD
systems development. It aims to bridge theoretical knowledge and practical application
in BD, focusing on innovative design and effective evaluation to address practical
problems in the field.
1.5 Emergence of Research Gaps and Direction of Study
The analysis of existing BD RAs (as presented in Chapter 4) illuminates several research
gaps, particularly around identified failure modes in current architectural designs.
To address these gaps, this research advocates the development of a new RA that
embraces decentralised and domain-driven principles, aiming to enhance robustness in
data management and adaptability.
1.6 Research Questions
The research questions for this thesis are informed by the failure modes and gaps
identified in existing BD RAs. These questions are designed to confront the challenges
in the field of BD systems and their architectural designs:
RQ1 What are the limitations of current big data reference architectures? This
question aims to identify and analyse the key limitations in existing BD RAs,
particularly in areas critical to their effectiveness, such as scalability, adaptability,
and maintainability.
This chapter aligns with the broader research objectives of understanding the
architectural complexities of BD systems.
• Design of the Artefact: The design and development of the artefact, Metamycelium,
are thoroughly discussed in Chapter 6. It covers the artefact’s requirements, the
underlying theories, and the development process.
• Evaluation of the Artefact - Part 1: The first part of the artefact’s evaluation, a
single-case mechanism experiment, is detailed in Chapter 7. This chapter outlines
the research design and presents the results of the prototype evaluation.
• Discussion: Chapter 9 integrates the research findings into a broader dis-
cussion. It synthesises key insights, analyses the results, and discusses the
implications and generalisability of the study.
• Appendix A: Expert Opinion Guide: Detailed guide used for gathering expert
opinions, as referenced in Appendix A.
1.8 Conclusion
This chapter establishes the foundation for the research by outlining its objectives and
methodologies and framing the context for BD exploration. The following chapters will
delve into the research design, philosophies and methods that guide this study.
As the thesis progresses, each chapter builds upon this initial conceptual framework,
exploring the complexities and challenges of BD systems and their architectural solu-
tions. This chapter serves as a stepping stone into the exploration of BD, setting the
stage for analyses and discussions in the subsequent chapters.
Chapter 2
Research Methodology
2.1 Introduction
Building upon the foundations set for the research in the previous chapter, wherein
motivations, objectives, and philosophies are discussed, this chapter delves into the
principles that drive the research design and methodology.
All research is influenced by the assumptions made by the researcher about what
constitutes valid research and the methods deemed appropriate for the specific prob-
lem. To ensure the rigour and quality of research, it is therefore imperative to first
elucidate these underlying assumptions (Myers & Avison, 2002). In alignment with
this perspective, this chapter aims to elaborate on the principal components of the
current research: research approaches, philosophical worldviews, research designs, and
research methods. Moreover, the central research questions are restated and further
examined. Figure 2.1 portrays a high-level view of all the elements incorporated in the
research methodology of this study.
Subsequent sections discuss the essence of good Information Systems (IS) Research
(Section 2.2), illuminate the selection of the research approach (Section 2.4), and shed
light on the multifaceted plans influenced by the researcher’s philosophical assumptions
about the subject (Section 2.5). This discourse leads to the intricacies of the research design,
exploring the architectural layout of the research (Section 2.6). The chapter concludes
by summarising the aforementioned discussions, laying the groundwork for subsequent
chapters.
2.2 What Makes a Good Information Systems Research?
Among the discernible trends in IS research, one often referred to as traditional has
its roots in the positivist quantitative approach of natural science. While this approach
is internally consistent and can yield significant results, there are instances where it
might not lead to practical or definitive research outcomes (D. R. Vogel & Wetherbe,
1984). Creswell and Creswell (2017) suggest that many of these studies rely heavily
on laboratory-based experiments or field surveys, prioritising statistical analysis. This
approach has several noted challenges:
2. When setting values for variables, there might be occasions where certain assump-
tions could misrepresent real-world complexities.
3. A focus is often not given to artefacts in these studies, possibly sidelining the
applied significance of IS and its business relevance.
2.3 Research questions
Based on the findings from Chapters 3, 4, and 5, and the concepts discussed in Chapter 1
and in the previous section, the research questions formulated for this thesis are as
follows:
RQ1 What are the limitations of current big data reference architectures?
RQ2 How can a reference architecture be designed to mitigate or address these limita-
tions in big data projects?
2.4 The Selection of the Research Approach
Research approaches serve as procedures and plans for research, guiding the progression
from broad assumptions to detailed methods of data collection, analysis, and interpreta-
tion (Creswell & Creswell, 2017). Such plans are shaped by philosophical assumptions
regarding the subject matter, the chosen research design (the procedure of enquiry), and
specific methods for data collection, analysis, and interpretation. The research approach
derives from the research problem, the study’s audience, and the research questions.
For the extraction of information from a broad research problem into successive
methods, four critical elements need definition: research approach, underlying philoso-
phy, research design, and research methods.
According to Creswell and Creswell (2017), the three prevalent research approaches
include a) quantitative, b) qualitative, and c) mixed methods. The choice among them
depends on the philosophical assumptions (positivist, interpretive) held, the selected
research strategies (qualitative grounded theory, quantitative surveys), and the methods
chosen to execute these strategies (collecting data through observation for qualitative
research or using instruments for quantitative research, which can also include textual
data).
A frequent distinction between quantitative and qualitative research approaches
lies in data collection and presentation methods. While quantitative approaches often
employ numbers, they can also rely on textual data. On the other hand, qualitative
research typically seeks to describe existing phenomena, although it can also explore
emerging ones. Furthermore, quantitative research does not strictly rely on closed-ended
questions based on a priori knowledge and an existing theory; it can also encompass
open-ended inquiries. Mixed-method research combines elements of both qualitative
and quantitative research, merging the results to shed light on significant societal issues
(Myers & Avison, 2002; Sevilla, 1992).
The continuum of research approaches ranges from qualitative at one end to quanti-
tative at the other, with mixed methods occupying an intermediary position (I. Newman,
Benz & Ridenour, 1998). Historically, quantitative research has been dominant across
various disciplines, not just the social sciences, throughout the 19th and mid-20th
centuries. On the other hand, the appeal of qualitative research grew in the latter half of
the 20th century (Myers & Avison, 2002).
For this thesis, a qualitative approach has been adopted, given its suitability for
exploring emerging phenomena like a novel domain-driven distributed architecture for
BD systems (Myers & Avison, 2002; Creswell & Creswell, 2017). Qualitative research
allows for an in-depth understanding of the complexities and nuances involved in the
development of BD systems and their associated software components. This approach
is particularly valuable when investigating a new and evolving field, as it enables the
researcher to delve into the intricacies of the subject matter and consider the contextual
factors that influence the design and implementation of these systems.
Furthermore, the exploratory nature of qualitative research aligns well with the
aim of this study, which is to inductively build knowledge from data and interpret the
significance of the findings in relation to the broader context of BD systems development
(Merriam & Grenier, 2019).
While a quantitative approach could provide valuable insights through the evaluation
phase of the artefact, it may not fully capture the depth and breadth of understanding
required to address the research problem and questions at hand. Similarly, a mixed-
methods approach, although potentially beneficial, might not allow for the same level
of focus and immersion in the qualitative aspects of the study. By adopting a primarily
qualitative approach, this thesis aims to contribute to the understanding of domain-driven
distributed architectures for BD systems in a meaningful way while still incorporating
quantitative elements as needed to support and complement the qualitative findings.
2.5 The Underlying Philosophy
The underlying philosophy of research, despite being largely tacit, inextricably influ-
ences the overall momentum of the research (Slife, Williams & Williams, 1995). This
research philosophy is built upon the foundational concepts of a researcher’s worldview,
which in turn, guides their approach to conducting research.
Numerous phrases are used interchangeably with the term worldview, such as
ontologies, epistemologies (Crotty, 1998), paradigms (Lincoln, Lynham, Guba et al.,
2011), and philosophies (Neuman, 2007). For the purpose of this study, worldview is
defined as the overarching philosophical framework that underpins the research.
Following the classic classification of research epistemologies provided by Chua
(1986) and more modern works of Phillips, Phillips and Burbules (2000), Crotty (1998),
Creswell, Hanson, Clark Plano and Morales (2007), Lincoln et al. (2011), Mertens
(2008), Creswell and Creswell (2017), Myers and Avison (2002), and Neuman (2007),
five philosophies frequently discussed in the literature are highlighted: 1) positivism 2)
post-positivism 3) constructivism 4) transformative, and 5) pragmatism.
A detailed description of each philosophy and its drawbacks remains outside the
scope of this study. However, a brief description focuses on the philosophy selected for
this study and its justification. Positivists perceive reality objectively, with measurable
properties to be described; thus, the observer does not influence the observed (Myers
& Avison, 2002). Positivists aim to test a theory, predicting potential outcomes. IS
research is perceived as positivist when evidence exists for quantifiable measures of
variables, drawing from samples of the stated population, and if formal propositions
are present (Orlikowski & Baroudi, 1991). This represents one of the oldest research
philosophies and aligns more with quantitative research.
Following positivism, a new school of thought, termed postpositivism, emerges that
challenges the traditional philosophy of positivism and the notion that knowledge is
absolute (Phillips et al., 2000). Postpositivism holds a key assumption that knowledge
is conjectural, suggesting a lack of absolute certainty about claims (J. K. Smith, 1983;
Phillips et al., 2000). This philosophy centres on determinism and theory verification.
Constructivism, often merged with interpretivism, is viewed as a suitable philosophy
for qualitative research and derives mainly from the works of Berger, Luckmann et
al. (1966) and Lincoln and Guba (1985). Contrasting the two previously discussed
philosophies, constructivism adopts the stance that reality is perceived subjectively. A
central constructivist premise is that individuals form their subjective reality based on
internalised meanings, and subjective meaning is derived from cultural and historical
norms inherent in the individual’s context. Constructivism does not begin with a theory
but aims to inductively formulate one.
A philosophy that emerged during the 1980s and 1990s is the transformative phi-
losophy (Creswell & Creswell, 2017). This approach concentrates on marginalised
individuals in society. A focal point of this philosophy concerns the ways marginalised
groups experience oppression and the strategies that challenge and overturn these con-
straints (Mertens, 2019). Mixed-method research approaches, such as exploratory
convergent mixed-method research, are typically adopted.
Lastly, there is the pragmatic philosophy, originating from the works of Peirce (1878),
who first introduced the term and laid the groundwork for the pragmatic maxim. This
philosophy was further developed by James (1981) and Dewey et al. (1917).
Pragmatism emphasises the application of concepts and how action addresses problems
(Patton, 2002). Other contributors to the development of pragmatism include Cook
(1993), Ochs (1998), and, more recently, Talisse and Aikin (2011). This worldview does
not strictly adhere to a singular system of philosophy or reality. Instead, it emphasises
the practical consequences of ideas and the importance of context in shaping our
understanding. Contemporary pragmatic approaches to research freely incorporate both
qualitative and quantitative assumptions to address research inquiries effectively.
problems inherent in the development of IS and address them to allow successful imple-
mentation of these systems within organisations (March & Smith, 1995; Nunamaker Jr,
Briggs, Mittleman, Vogel & Pierre, 1996).
Markus, Majchrzak and Gasser (2002a) and Walls, Widmeyer and El Sawy (1992)
describe DSR as an endeavour aimed at developing IS to support emerging knowledge
processes, embodied in IS design theories.
These theories are prescriptive means that facilitate effective system development
for a class of problems and are usually evaluated concerning their utility in a real-world
context (Markus, Majchrzak & Gasser, 2002b).
Therefore, based on the premises above, the purposes of this study and the inherent
properties of design science align, which are to develop and evaluate an IT artefact
designed to solve an identified organisational problem.
To attain a clear understanding of this research design, one first has to accept a
dichotomy; that is, design is both the product (artefact) and the processes (a set of
activities). Holding this dual perspective, the researcher constantly switches between
the processes involved in the design and the artefact for the same complex problem
(Walls et al., 1992). This process continues until the researcher reaches a point where
further iterations yield diminishing returns in terms of improving the artefact or the
design process.
Once this point is reached, the evaluation of the artefact begins, going through
several iterations to enhance the quality of both the product and the design process.
Throughout these stages, both the artefact and the design process evolve. In the context
of this study, the theory and design are considered to be axiomatically connected,
meaning that the theory inherently informs the design, and the design, in turn, shapes
the development of the theory.
Within the IS discipline, DSR is applied to a wide range of problems, including well-
defined, structured problems as well as more complex, ill-structured ones (A. R. Hevner
et al., 2004; Peffers et al., 2007). DSR is particularly well-suited for addressing sophis-
ticated or wicked problems (Brooks Jr, 1996; Rittel, 1984), which are characterised
by:
• The inherent uniqueness and novelty of the problem and its solution
2.8 Research Goal
DSR can be used for a variety of research goals, and the goal is the deciding factor in
choosing the right research design. Wieringa (2014) defines four major research goals,
which are artefact (re)design goals, prediction goals, knowledge goals, and instrument
design goals.
The main goal of this study is to improve the performance, scalability, maintainabil-
ity, and effectiveness of an artefact for a class of problems. Therefore, this research
falls under the design goals category. Design goals refer to the specific objectives or
criteria that a design solution must meet to be considered successful. These objectives
can vary greatly depending on the context, industry, or application of the artefact. For
example, the design goals of a product may be focused on improving its usability
and marketability, while the design goals of a software application may prioritise its
performance and reliability.
Because design goals have different objectives, they typically require a unique
design cycle that is tailored to their specific needs. This design cycle may involve
different stages or processes, such as user research, prototyping, testing, and evaluation,
depending on the goals of the design. By going through this cycle, designers can refine
and iterate their design solutions to better meet the specific goals and needs of their
users or stakeholders.
2.9 Design Cycle
Simon (1981) is a pioneering figure in the field of design science, having significantly
contributed to its development by adapting and formalising principles from engineering
and applying them to the study of artificial systems. His work laid the groundwork
for the further evolution of design science as a discipline. He is a prominent scholar,
researcher, and thinker who has made significant contributions to a variety of fields,
including economics, psychology, and computer science. In his work on design science,
Simon emphasised the importance of a design cycle that incorporates iterative processes
of problem-solving, prototyping, and testing. He believed that design was not a linear
process but rather a continuous cycle of refinement and improvement. Simon also
stresses the importance of incorporating user feedback and evaluating the success of
design solutions against specific criteria or objectives.
Furthermore, Nunamaker Jr, Chen, Purdin, Sprague Jr and Takeda (2009) and Takeda
(2014), among other researchers in the field of DSR, build upon Herbert Simon’s ideas
on the iterative design cycle and the importance of evaluating design solutions against
specific objectives or design goals. The work of Peffers et al. (2007) serves as another example
of how Simon’s ideas on the design cycle have influenced and shaped the field of DSR
and how other researchers have built upon and extended these ideas over time.
The common denominator of all these seminal works is the design cycle. For the
purposes of this thesis and inspired by the aforementioned works, the design cycle of
this study follows the guidelines provided by Wieringa (2014). This cycle is made up of
three major phases: problem investigation, treatment design, and treatment validation.
Wieringa (2014) defines the design cycle as a subset of the engineering cycle, and the
engineering cycle is referred to as a ‘rational problem-solving process’. The engineering
cycle of this thesis is portrayed in Figure 2.2 and is made up of the following steps:
1. Stakeholder and goal analysis: What is the goal? Who is affected by it?
2. Problem investigation
3. Requirement specification
4. Treatment design
5. Treatment validation
6. Treatment implementation and evaluation
The conceptual framework used to define the elements of the engineering cycle is
defined by Wieringa (2014) and is further explored in the following subsections.
2.9.1 Step 1: Stakeholder and Goal Analysis
The initial point of this design cycle is the stakeholder and goal analysis. This step
is pivotal, as it sets the trajectory for the remainder of the study and highlights in-
tegral elements. Two of these elements are stakeholders and goals. In accordance
with the methodology of A. R. Hevner et al. (2004), the DSR process commences with the
identification of a problem or opportunity that can be addressed through the creation of
an artefact, such as a system, process, or algorithm.
This step first involves specifying the objectives of the research and, second, identi-
fying the potential stakeholders who will utilise or benefit from the artefact. Discussion
of stakeholders and goals follows in the subsequent sub-sections.
Goals: In DSR, project goals are determined by the specific requirements of the
research undertaking. These goals can be user-centred, object-centred, machine-centred,
or focused on any other relevant aspect, depending on the research context and objectives.
While some DSR projects may prioritise stakeholder needs, others might focus on
addressing research gaps or answering specific research questions (Peffers et al., 2007;
A. Hevner & Chatterjee, 2010). It is important to recognise that there is no single DSR
methodology; instead, DSR encompasses a range of methodologies and variations that
researchers adapt to their particular research settings and aims.
Here, the goals of the DSR project are moulded by the research objectives and
questions (see Section 2.3), which might not necessarily align with a particular stake-
holder group’s needs. Hence, while stakeholder needs might be considered during the
development of this DSR project, the primary driver remains the research questions and
the potential academic contributions from the design and evaluation of a new artefact
(Peffers et al., 2007; March & Smith, 1995).
how the artefact should do it. Functional requirements are typically expressed as use
cases, which describe the specific interactions between users and the system. Non-
functional requirements may include performance requirements, security requirements,
and usability requirements.
The requirement specification designed for this thesis is made up of several phases.
By analysing the results of the SLR for RAs conducted in Chapter 4, several studies
have been identified that deeply explore the relevant requirements for a BD RA from
the perspective of the RA’s development methodology. These studies provide valuable
insights into the types of requirements that are essential for developing a BD RA.
In an extensive effort, the NIST BD Public Working Group embarked on a large-
scale study to extract requirements from a variety of application domains such as
healthcare, life sciences, commercial, energy, government, and defence (W. L. Chang
& Boyd, 2018). The result of this study is the formation of general requirements into
seven categories.
Additionally, Volk, Staegemann, Trifonova, Bosse and Turowski (2020) categorise
nine use cases of BD projects sourced from published literature using a hierarchical
clustering algorithm. Furthermore, Bashari Rad, Akbarzadeh, Ataei and Khakbiz (2016)
focus specifically on security and privacy requirements for BD systems, providing a
comprehensive analysis of these critical aspects from the RA development perspective.
J.-H. Yu and Zhou (2019) present modern components of BD systems using goal-
oriented approaches, offering insights into the functional requirements of BD architec-
tures and their impact on the RA design process. Eridaputra, Hendradjaya and Sunindyo
(2014) created a generic model for BD requirements, which serves as a foundation for
understanding the overall requirements landscape in the context of RA development.
Lastly, Al-Jaroodi and Mohamed (2016) investigate general requirements to support
BD software development, contributing to the understanding of the necessary features
and characteristics of BD systems that should be considered when developing an RA.
By examining these RAs and their development methodologies, and by evaluating
the design and requirement engineering processes specific to BD RAs, a set of high-level
requirements based on BD characteristics is established. This approach ensures that the
requirements are grounded in the current state of knowledge in the field and are tailored
to the unique needs of BD RAs.
The resulting set of requirements encompasses the essential aspects of BD systems (data volume, velocity, variety, veracity, and value) as well as critical considerations such as security, privacy, metadata, and performance. These requirements form the basis
for the design and development of the Metamycelium RA, ensuring that it addresses
the key challenges and requirements identified in the literature and aligns with the best
practices in RA development methodologies.
After clarifying the type of requirements and the relevant requirements, current BD RAs
and their requirements are assessed to gain insights into the available BD requirement
categorisation methods. The analysis of these studies reveals a common approach
to categorising requirements based on BD characteristics such as velocity, veracity,
volume, variety, value, security, and privacy (Ataei & Litchfield, 2022; Bahrami &
Singhal, 2015; Rad & Ataei, 2017b; H.-M. Chen, Kazman & Haziyev, 2016a).
The V’s of BD (volume, velocity, variety, veracity, and value) provide a multi-
dimensional framework for categorising requirements. Each characteristic represents a
distinct aspect of BD systems, and by aligning requirements with these characteristics,
a comprehensive and structured approach to requirement categorisation is achieved.
This categorisation method enables a holistic view of the requirements landscape,
ensuring that all critical aspects of BD systems are considered in the development of
the Metamycelium RA.
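To make this categorisation approach concrete, the following minimal sketch groups high-level requirements by the V characteristics; the requirement identifiers and descriptions are hypothetical and do not represent the actual requirement set derived for Metamycelium.

# Illustrative sketch (hypothetical requirement identifiers and descriptions):
# grouping high-level requirements by the BD characteristics used for categorisation.
from collections import defaultdict

requirements = [
    ("Vol-1", "volume", "Scale storage and processing horizontally."),
    ("Vel-1", "velocity", "Support both stream and batch processing."),
    ("Var-1", "variety", "Ingest structured, semi-structured, and unstructured data."),
    ("Ver-1", "veracity", "Record provenance and lineage for ingested data."),
    ("Val-1", "value", "Expose curated data products for analytical consumption."),
    ("SeP-1", "security_privacy", "Encrypt data in transit and at rest."),
]

by_characteristic = defaultdict(list)
for req_id, characteristic, description in requirements:
    by_characteristic[characteristic].append(req_id)

for characteristic, ids in sorted(by_characteristic.items()):
    print(characteristic, "->", ids)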
The Design phase is a crucial step in the methodology, occurring after the problem
identification, motivation, and objective definition stages. This phase involves two
integral components: theory and artefact.
The terms treatment and artefact are used in a specific manner, drawing inspiration
from Wieringa’s design science research methodology (Wieringa, 2014). Treatment
refers to the major steps in the design cycle, such as treatment design, treatment
validation, and treatment evaluation, whereas artefact, alongside theory, refers to the actual created artefact that emerges from the treatment design phase.
Theory
The first output activity of this phase involves the creation of a design theory, which
represents the foundational knowledge behind the artefact that is being developed. The
design theory encompasses the requirements, constraints, and best practices that inform the artefact being developed.
Artefact
The second part of the treatment design phase involves designing a solution or artefact
to address the identified problem or opportunity (see Section 6.5). The goal of this phase
is to create a design that meets the identified requirements and can be implemented and
tested.
As discussed in Section 4.6.3, there are several approaches to the systematic de-
velopment of RAs. Cloutier et al. (2010b) demonstrate a high-level model for the
development of RAs through the collection of contemporary architectural patterns and
advancements. Bayer et al. (1999) introduces a method for the creation of RAs for
product line development called PuLSE DSSA.
Stricker et al. (2010a) present the idea of a pattern-based RA for service-based
systems and use patterns as first-class citizens. Similarly, Nakagawa, Guessi, Mal-
donado, Feitosa and Oquendo (2014) presents a four-step approach to the design and
(2011b) methodology, other researchers such as Cloutier et al. (2010a) and Derras et al.
(2018a) have promoted the same ideas.
Phase 2: Selection of Design Strategy Angelov, Trienekens and Grefen (2008) and
Galster and Avgeriou (2011a) have both presented that RAs can have two major design
strategies: 1) RAs that are designed from scratch (practice driven) and 2) RAs that are
based on other RAs (research-driven). Designing RAs from scratch is rare and usually
takes place in an emergent domain that has not received much attention (Angelov et al., 2008). On the other hand, most RAs today are the amalgamation of a priori concrete architectures, RAs, models, patterns, and best practices that together help shape a compelling artefact for a class of problems.
RAs developed from scratch tend to create more prescriptive theories, whereas RAs
developed based on the available body of knowledge tend to provide more descriptive
design theories. The RA designed for this study is a research-based RA based on
existing RAs, concrete architectures, patterns and best practices.
system at different architectural layers. This means that the architect is equipped with an
integrated architectural tool that visualises and describes different architecture domains
and their underlying relations (M. M. Lankhorst, Proper & Jonkers, 2010; Engelsman,
Quartel, Jonkers & van Sinderen, 2011).
The variability model is depicted in Figure 2.3 using Archimate’s motivation layer.
This model is informed by the graphical notations of variability by Pohl, Böckle and
Van Der Linden (2005) and the concepts of variability management by Rurua et al.
(2019), providing architects with a tool for application to the variable components of
Metamycelium described in Section 6.11.
Research Context
This evaluation method is designed with a specific knowledge goal in mind. The goal
is to validate the artefact against a specific context to prove its feasibility, to infer its
internal mechanisms and to validate the theories discussed in Chapter 6. The current
state of BD RAs and the theoretical framework that guides the creation of the artefact’s
prototype is discussed in Chapters 3, 4, 5 and 6.
Research Problem
The conceptual framework for this research comprises architectural structures. The
framework’s validity relies on the clarity of definitions and the avoidance of mono-
operation bias (relying on a single operationalisation of a construct) and mono-method
bias (using a single method of measurement). Within this context, the case mechanism
experiment specifically aims to investigate the internal workings and performance of
the Metamycelium artefact in handling BD challenges.
The experiment focuses on understanding the mechanisms, trade-offs, sensitivities,
and architectural explanations that underlie the artefact’s behaviour and effectiveness.
To achieve this, the experimental design employs open-ended knowledge questions that
probe into the artefact’s mechanisms, trade-offs, sensitivity, and architectural aspects.
Of particular interest is understanding the stimuli that trigger specific mechanisms and
how the interaction of various components within the artefact leads to the emergence of
observed phenomena.
This section focuses on the third part of the single-case mechanism experiment design,
namely, the research design and validation. In this section, inference support, repeata-
bility, and ethics are presented. These aspects are important to the overall success of the
experiment.
Inference Support
In the context of the evaluation, the inference approach adopted is abductive reasoning.
Abductive inference is well-suited for evaluating a distributed and domain-driven
BD RA due to its focus on seeking the simplest or most plausible explanations for
observed phenomena (Hoffman, 1997). The abductive reasoning process for evaluating
While abductive reasoning is chosen as the primary inference approach for evaluat-
ing Metamycelium, other inference methods were also considered. Statistical inference,
which primarily deals with drawing conclusions about a population based on a sample,
may not capture the complexities of the architecture’s behaviour. Descriptive infer-
ence, which focuses on summarising data patterns, may not provide insights into the
underlying mechanisms driving the architecture’s performance.
Analogic inferences, which involve drawing conclusions based on similarities with other cases or domains, were deemed less suitable for evaluating a distributed BD architecture. The unique complexities and intricacies of the architecture and its specific domain require a more nuanced approach such as abductive reasoning, which allows for a deeper understanding of the system's behaviour and the underlying mechanisms driving its performance (Wieringa, 2014).
Repeatability
To ensure repeatability and enhance the transparency and systematicity of the artefact
prototype evaluation, rigorous research methods are employed. Notably, all code
implementations related to the distributed BD architecture artefact are stored in a
publicly accessible version control repository on GitHub in two repositories (Polyhistor,
2023b, 2023a). This practice enables researchers and practitioners to replicate the
experiments, review the codebase, and validate the findings. By providing a centralised
and versioned repository, the artefact project fosters collaboration, knowledge sharing,
and reproducibility within the research community.
Furthermore, the explanations accompanying the artefact prototype encompass com-
prehensive guidelines and instructions, allowing independent researchers to replicate
the experiments and validate the results. The inclusion of detailed procedural descrip-
tions, datasets, and configuration parameters ensures that others can conduct similar
evaluations and compare their outcomes against the reported results.
In the Research Execution and Analysis phase, the practical application of the design
is initiated, and a prototype is developed based on the selected technologies deemed
most appropriate for the RA. This phase aims to evaluate the design in a controlled
environment, yielding data that enables an analysis of the design’s effectiveness and its
alignment with the defined research objectives. Crucial to this phase is the translation
of theoretical constructs into a functional entity. It provides empirical data for rigorous
analysis, facilitating invaluable insights for the validation or the necessary revision of
the design.
To ensure that the technology choices are justified, these decisions are aligned with
research objectives (Chapter 2) and the requirements of the artefact (Section 6.1).
The technology selection process section outlines the systematic approach employed to
identify and evaluate the most suitable technologies for instantiating the Metamycelium
prototype, ensuring alignment with the research objectives and artefact requirements.
Requirement Analysis
The first step is requirement analysis, which starts by defining the precise capabilities that the prototype needs. These are based on the requirement specification developed in Section 6.1 and the components of Metamycelium discussed in Section 2.9.4.
Evaluation Criteria
For a thorough evaluation of the potential tools, adherence to an established quality
standard is crucial. In this context, ’tools’ refers to the specific software technologies,
frameworks, and platforms that will be used to create a concrete implementation
of our Reference Architecture. The ISO/IEC 25000 SQuaRE standards (ISO/IEC
25000:2005. Software Engineering — Software product Quality Requirements and
Evaluation (SQuaRE) — Guide to SQuaRE, 2014), globally recognised for software
quality measures, are chosen to guide the tool evaluation process. Since we are selecting
technologies for instantiating a concrete architecture from the RA, these software
quality standards are directly applicable to evaluating the implementation tools and
technologies.
The SQuaRE standards encompass criteria such as functional suitability, perfor-
mance efficiency, usability, reliability, and security, among others, serving as a compre-
hensive framework for assessing software quality.
However, the entire ISO/IEC 25000 SQuaRE standards series is quite extensive and
may be time-consuming to fully apply, particularly given the resources associated with
this research. Therefore, a condensed version of the SQuaRE standards is proposed,
focusing on critical factors aligned with the research context and project requirements.
This modified framework comprises functional suitability, maintainability, com-
patibility, and portability. Functional suitability is concerned with whether the tool
functions as expected and meets the stated requirements; Maintainability refers to the
ease with which the software can be modified to correct faults, improve performance or
other attributes, or adapt to a changed environment; Compatibility explores the degree
to which a tool can perform its required functions together with other tools; and lastly,
Portability assesses the effectiveness and efficiency with which a system or component
can be transferred from one hardware, software, or operational environment to another.
For the scoring of these criteria within the evaluation matrix, a systematic and
objective approach is followed wherever possible. Each criterion is assigned a score
on a scale of 1–5, with 1 indicating poor alignment with the criterion and 5 indicating
excellent alignment.
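To illustrate how such an evaluation matrix can be operationalised, the following minimal sketch scores two candidate tools against the four condensed criteria; the tool names, scores, and equal weighting are assumptions made for illustration only and do not reflect the actual assessment.

# Illustrative evaluation matrix: hypothetical tools scored 1-5 against the four
# condensed SQuaRE-derived criteria; scores and equal weights are assumptions.
criteria = ["functional_suitability", "maintainability", "compatibility", "portability"]

scores = {
    "tool_a": {"functional_suitability": 5, "maintainability": 4, "compatibility": 4, "portability": 3},
    "tool_b": {"functional_suitability": 4, "maintainability": 3, "compatibility": 5, "portability": 4},
}

def total_score(tool_scores, weights=None):
    # Equal weighting by default; weights could reflect research priorities.
    weights = weights or {criterion: 1.0 for criterion in criteria}
    return sum(tool_scores[criterion] * weights[criterion] for criterion in criteria)

for tool in sorted(scores, key=lambda t: total_score(scores[t]), reverse=True):
    print(tool, total_score(scores[tool]))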
Additionally, hands-on exploration and testing are conducted to evaluate the com-
plexity of making modifications to the tool, such as adding new features, fixing
bugs, or integrating with other components of the Metamycelium architecture.
Technology Research
The research methodology employed for the technology research phase is a lightweight,
structured literature review combined with hands-on exploratory testing. This dual
are used in that instantiation. This includes the researcher’s approach to technology
selection. In the second phase, all platforms that offer technology categorisation
and comparison are reviewed. These platforms are Apache Software Foundation
(Apache, 2023b), GitHub (GitHub, 2023), G2 (G2, 2023), Capterra (Capterra, 2023),
and StackShare (StackShare, 2023).
Additionally, prominent industry and academic sources are leveraged, each offering
distinct perspectives and insights on current trends and capabilities. These sources
encompass the “2022 Gartner Magic Quadrant for Analytics and Business Intelligence
Platforms” (2022 Gartner® Magic Quadrant™ for Analytics and Business Intelligence
Platforms, 2022), McKinsey & Company’s “Technology Trends Outlook” (Technology
Trends Outlook, 2023), MIT Technology Review Insights’ report on "Building a High-
Performance Data and AI Organisation" (Building a High-Performance Data and AI
Organization, 2023), and the annual StackOverflow Developer Survey (Overflow, 2022).
These tools are checked against the components of Metamycelium that are described in
Section 6.5.1.
The aim of the evaluation is not to evaluate all technologies in the market but to
only analyse the most supported and most used tools and technologies. The result of
the search discussed in the previous section provided a wide variety of tools. While
the majority of these tools are specifically designed for BD, some are general software
engineering tools, and some are more data science and business intelligence oriented.
The most supported and most used tools were determined based on factors such as
the number of GitHub stars, total downloads, and mentions in well-established industry
surveys like the Stack Overflow Developer Survey. The results of the technology selec-
tion research, including the chosen technologies for each component of Metamycelium,
are presented in Section 7.2 of Chapter 7.
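The following minimal sketch illustrates how such popularity signals could be combined into a single ranking; the candidate names, metric values, and normalisation scheme are hypothetical and serve only to clarify the selection logic, not to reproduce the actual results.

# Illustrative ranking of candidate tools by community-support signals
# (GitHub stars, downloads, survey mentions); all values are hypothetical.
candidates = {
    "tool_x": {"github_stars": 35000, "downloads_millions": 120, "survey_mentions": 3},
    "tool_y": {"github_stars": 12000, "downloads_millions": 40, "survey_mentions": 1},
}

def normalised(metric, value):
    # Normalise each metric against the maximum observed value across candidates.
    maximum = max(entry[metric] for entry in candidates.values())
    return value / maximum if maximum else 0.0

def support_score(tool):
    return sum(normalised(metric, value) for metric, value in candidates[tool].items())

for tool in sorted(candidates, key=support_score, reverse=True):
    print(tool, round(support_score(tool), 2))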
The treatment implementation and evaluation phase of this thesis involves creating and
deploying the prototype. Evaluation of the RA ensures that it achieves the goals stated
prior to its development, tests its effectiveness and usability, and ensures it addresses
the identified problems. Key pillars of the evaluation are the RA’s correctness, utility, adaptability, and efficiency (Galster & Avgeriou, 2011a).
The RA’s quality is assessed by its transformation capability into an effective
concrete architecture. Building this RA on top of former RAs offers the advantage of
drawing insights from other studies for the evaluation process (Sharpe et al., 2019).
However, evaluating RAs presents distinct challenges. For one, RAs are abstracted
at a higher level, causing stakeholders to remain ungrouped and leading the RAs to be
more focused on architectural qualities than concrete architectures. Existing methods
designed for concrete architectures, like Scenario-based Architecture Analysis Method
(SAAM) (Kazman, Bass, Abowd & Webb, 1994), Architecture Level Modifiability
Analysis (ALMA) (Bengtsson & Bosch, n.d.), Performance Assessment of Software
Architecture (PASA) (Williams & Smith, n.d.), and Architecture Trade-off Analysis
Method (ATAM) (Kazman et al., 1998a), are ill-suited for direct application to RAs.
There are three principal challenges:
1. Lack of a clearly defined group of stakeholders for RAs makes applying existing
evaluation methods difficult (Angelov et al., 2008). Since methods like ATAM
heavily rely on stakeholder participation, the abstract nature of RAs makes
reaching diverse stakeholder groups challenging. Potential biases may also be
introduced when stakeholders from varied backgrounds participate, with some
possibly lacking architectural visions.
broad spectrum of scenarios can complicate data analysis, and prioritising these
scenarios, defining, and validating them becomes problematic. Generalising sce-
narios might make RA evaluation incomplete and ineffective (Avgeriou, 2003a).
Such issues are noticed even while evaluating complex concrete architectures
(Bengtsson & Bosch, 1998).
functionality (see Chapter 7), the case mechanism experiment simultaneously validates
and evaluates the artefact. That is, the case mechanism experiment is achieving both
artefact validation and artefact evaluation. The validation part is highlighted by the
creation of the artefact out of the design theories discussed in Section 6.4, and the
evaluation part is highlighted by the creation of a range of scenarios and running them
against the artefact as discussed in Chapter 7.
Following this validation and evaluation through the case mechanism experiment,
the artefact and the prototype undergo further scrutiny through expert opinion collection.
This additional step assesses the artefact’s effectiveness and usability from a practi-
tioner’s perspective (see Chapter 8). The ISO/IEC 25000 Software Product Quality
Requirements and Evaluation (SQuaRE) (ISO/IEC 25000:2005. Software Engineer-
ing — Software product Quality Requirements and Evaluation (SQuaRE) — Guide
to SQuaRE, 2014) standards guide the technology selection for the RA’s instantiation. In the
following sections, the research methodology employed for expert opinion collection is
discussed.
Inspired by the works of Kallio, Pietilä, Johnson and Kangasniemi (2016), the research
methodology chosen for eliciting expert feedback encompasses five distinct stages: 1)
identifying the rationale for gathering expert opinion, 2) developing the preliminary
expert opinion gathering guide, 3) designing the research method for collecting data in a rigorous and unbiased manner, 4) pilot testing the guide, and 5) presenting the results.
Given that the conceptual framework of this study is comprised of architectural
constructs, expert feedback is posited as a suitable approach (Creswell et al., 2007).
Venturing into a new domain replete with latent potential, it is postulated that these
erudite perspectives can furnish invaluable insights. The expert opinions can offer
prospective trajectories warranting exploration, thereby enhancing the robustness of the study.
Expert opinions are collected using the software Zoom. Following each session,
recordings are downloaded along with their automatically generated transcripts. These
transcripts undergo an initial review to rectify any inconsistencies or issues before
being uploaded to the NVivo software. Each transcript is coded in accordance with a
predefined set of primary themes. These themes, detailed in Section A.5 of the guide,
The third iteration prioritised security and privacy, which are essential for safe-
guarding data and maintaining user confidentiality. As data volumes expanded, the
architecture needed to address growing potential vulnerabilities and exposures.
The fourth iteration, spurred by challenges in security measures, especially secret
management, further developed the domain-driven design. This evolution targeted
clearer inter-domain communication and defined responsibilities. By aligning architec-
tural constructs with business domains, the design ensured solutions were more in tune
with real-world challenges.
Subsequently, the architecture adopted the service mesh, a method apt for managing
the complex network of services in BD systems. This infrastructure layer streamlines
service-to-service communication. With the complexity of the architecture, the need for
observability became evident. It was not merely about monitoring but understanding
how data moved, changed, and integrated throughout the system, allowing for timely
identification and mitigation of issues.
Further iterations identified challenges in maintaining consistent standards across
distributed domains. The non-conformity by local agents jeopardised data integrity. This
highlighted the need for universal standards and policies. To enforce these standards,
an automation step was added to apply global policies consistently.
As the architecture was further refined, ensuring data consistency became pivotal.
The introduction of bitemporality and immutability addressed this. Bitemporality allowed data to be managed across two temporal dimensions (the time a fact is valid in the real world and the time it is recorded in the system), and immutability ensured that data, once stored, remained unchanged, reinforcing data integrity and traceability.
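To illustrate these two properties, the following minimal sketch models a bitemporal, append-only record in Python; the field names and the in-memory ledger are hypothetical and are not drawn from the actual Metamycelium implementation.

# Illustrative bitemporal, append-only ledger: each record carries a valid-time
# interval (real-world validity) and a recorded-at timestamp (transaction time);
# corrections append new records instead of mutating existing ones.
from dataclasses import dataclass
from datetime import date, datetime
from typing import Optional

@dataclass(frozen=True)  # frozen=True makes stored records immutable
class AddressRecord:
    customer_id: str
    address: str
    valid_from: date          # when the fact became true in the real world
    valid_to: Optional[date]  # None means "currently valid"
    recorded_at: datetime     # when the system learned and stored the fact

ledger: list = []

# Original record.
ledger.append(AddressRecord("c-1", "1 Queen St", date(2022, 1, 1), None, datetime(2022, 1, 2, 9, 0)))

# A later correction closes the old validity interval and appends a new record;
# the earlier row is never modified, preserving integrity and traceability.
ledger.append(AddressRecord("c-1", "1 Queen St", date(2022, 1, 1), date(2023, 6, 1), datetime(2023, 6, 2, 9, 0)))
ledger.append(AddressRecord("c-1", "9 Victoria Ave", date(2023, 6, 1), None, datetime(2023, 6, 2, 9, 0)))

print(len(ledger), "records retained")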
Although these iterations trace the architectural progression, the intricate details of
each step lie beyond this study’s purview.
2.11 Conclusion
In this chapter, firstly, the characteristics of good IS research are discussed. From
there on, the research philosophy and research approaches are discussed. Further, the
embodied research design and research methods are elaborated. This chapter aims to
provide a high-level understanding of the research methodology designed for this thesis.
Building upon this methodological groundwork, the ensuing chapter delves into a
foundational narrative literature review on BD and its environs.
Chapter 3
A Narrative Literature Review on Big Data
[Chapter overview figure: links Section 3.4 (Impact of Emerging Technologies on Big Data Architectures) to Chapter 1 (Introduction), Section 3.7 (An Architecture-Centric Approach to Big Data) to Chapter 4 (Big Data Reference Architectures), and lists the subsections of Section 3.8 on the V characteristics.]
3.1 Introduction
The previous chapter provided an introduction to the study, depicting major elements such as motivation, research methods, and philosophy. This narrative literature review presents a comprehensive overview of BD, its significance, and its impact on the current
technological landscape. This chapter aims to establish a solid contextual framework
for the subsequent exploration of BD RAs and the development of the Metamycelium
architecture.
This review traces the evolution of BD, examining the key technological advance-
ments that have fueled its growth and the profound impact of emerging technologies on
BD architectures. It highlights the ubiquity of BD applications across diverse sectors,
underlining the transformative potential of BD and the importance of effective BD
management strategies.
The chapter also delves into the business benefits and challenges associated with BD
adoption, emphasising the need for a more comprehensive and unified approach to BD
system development. It explores the fundamental characteristics of BD, namely volume,
variety, velocity, veracity, and value, which are crucial considerations in designing
architectures that can effectively handle BD workloads.
This review begins with Section 3.2, tracing the evolution of BD and its growing
importance. Next, Section 3.3 provides a concise historical overview of data man-
agement and BD development. The ’Impact of Emerging Technologies on Big Data
Architectures’ is examined in Section 3.4, highlighting how technological advancements
are shaping BD systems.
Section 3.5 then explores the ubiquity of BD applications across various sectors, em-
phasising its transformative potential in fields such as healthcare, social media analysis,
crime prevention, and tourism. The chapter proceeds to examine the business benefits
and challenges associated with BD adoption in the subsequent section, discussing the
shift towards data-driven decision-making and the need for alignment between insights
and business objectives.
’An Architecture-Centric Approach to Big Data’ is addressed in its own section,
introducing the limitations of traditional data management systems in handling BD and
the need for more flexible and scalable solutions. Subsequently, Section 3.8 delves
into the key characteristics of BD, commonly known as the ’5 Vs’ (Volume, Variety,
Velocity, Veracity, and Value), elucidating how these properties shape the challenges
and opportunities in BD systems. The chapter concludes with a critical analysis of the
BD landscape in Section 3.9, highlighting gaps in current research and practice.
Along the axis of technological change, there is substantial progress in the field of
software engineering. The rapid growth of this field has led to a discernible gap in the
adoption of new techniques and tools, particularly within the realm of data engineering
(Dehghani, 2022; Reis & Housley, 2022). This phenomenon, characterised by the slow
uptake of innovative software engineering advancements, has significant implications
for big data management and analysis (Ataei & Litchfield, 2023).
ThoughtWorks’ Technology Radar provides insights into the evolving landscape
of software engineering tools and techniques, highlighting a notable trend: the low
adoption rates of these advancements in the data engineering sector (ThoughtWorks,
2023). To understand how this gap emerged, we take a brief look at its history.
Initially, computers were viewed merely as tools for calculations and executing algo-
rithms. However, with the commercial availability of computers in the mid-1950s and
their subsequent utilisation for business operations, the creation and accumulation of
large data volumes began. This shift underscored the value of data and the importance
of effectively storing and managing it.
This led the industry to develop the concept of a Database Management System (DBMS), and organisations began to store data for various purposes. In 1968, as
a result of a NATO-sponsored conference, the term ‘software engineering’ emerged,
which refers to a highly systematic approach to software development and maintenance
(Wirth, 2008).
By the mid-2000s, as BD’s implications began to solidify, pivotal advancements emerged: Google introduced MapReduce, a programming model for processing and generating large datasets; Yahoo subsequently developed Hadoop, an open-source distributed processing framework built on this model; and in
2009, the Indian government undertook the ambitious project of capturing the iris scans
of 1.2 billion inhabitants, epitomising the scale of BD applications.
Such large-scale initiatives underscored the inadequacy of extant data systems.
McKinsey et al.’s (2011) publication, "Big Data: The Next Frontier for Innovation",
further emphasised this inadequacy, triggering more investments in the BD domain.
While the growth of computational power, the emergence of open-source commu-
nities, and the widespread use of the internet have accelerated advancements in BD,
an important challenge persists: contemporary data systems are still striving for the
maturity to effectively handle the magnitude and intricacies of BD.
3.4 Impact of Emerging Technologies on Big Data Architectures
suppliers, customers, and stakeholders has been amplified with the rise of digital
platforms (Bughin, 2016).
The launch of 5G technology, especially in regions such as the UK, represents more
than a mere speed advancement from 4G. Advanced features, including bi-directional
large bandwidth sharing, gigabit-level broadcasting, and support for AI-equipped wear-
ables, are poised to massively influence the data landscape (Gohil, Modi & Patel, 2013).
Such features not only increase the volume of data but also introduce complexity into
data management, storage, and processing (X. Jin, Wah, Cheng & Wang, 2015).
Such technological advancements necessitate robust, scalable, and adaptable data
architectures. The sheer volume of data generated every second poses significant
architectural challenges (Rad & Ataei, 2017c). As we navigate through the era of
technological advancements and the relentless accumulation of data, the challenge
becomes not just about gathering vast amounts of data but also making sense of it. This
scenario sets the stage for an exciting area of research, particularly in the realm of RAs.
The concept of domain-driven distributed RAs emerges as a particularly intriguing
solution, promising a structured approach to harnessing the power of BD.
Yet, despite its potential, this area remains largely unexplored, sparking a series of
questions about its effectiveness and scope. This study is motivated by the opportunity
to delve into this underexplored area, aiming to uncover how domain-driven distributed
RAs can provide a systematic framework for extracting valuable insights from BD.
Through this exploration, the aim is not only to address the challenges posed by BD but
also to turn these challenges into opportunities for innovation in the rapidly evolving
technological landscape.
Social, commercial, and industrial trends demonstrate the ubiquity of BD. At the 2013
World Government Summit in Abu Dhabi, Joseph S. Nye, a former US Assistant
Secretary of Defence and a professor at Harvard University, highlighted the potential
for future governance in the age of information (Nye, 2013). Nye proposed a scenario in which central governments could use BD to strengthen control. The field
of BD is vast and multifaceted, showcasing its breadth through the substantial body of
research conducted on the subject. According to Google Scholar, from January 2010 to
August 2023, there have been 17,800 studies exploring a wide array of topics within big
data, including mathematical techniques, decision-making methods, data characteristics,
technical challenges, and adoption failures.
In the realm of social networking, a study conducted by Dodds, Harris, Kloumann,
Bliss and Danforth (2011) analysed a dataset of 46 billion words from nearly 4.6 billion
expressions posted by 63 million unique users over 33 months. This research, aimed at
understanding temporal patterns of happiness, exemplifies the extensive scale and depth
at which BD is utilised in contemporary analysis.
S. Jin et al. (2015) proposed the Distributed Community Structure Mining (DCSM)
framework for BD analysis, utilising local information data in conjunction with the
MapReduce paradigm and algorithms such as FastGN and Radetal. This approach
effectively addresses key aspects of BD challenges, including scalability, velocity, and
accuracy. In a broader social context, Durahim and Coşkun (2015) employed a sentiment
analysis model to research the overall well-being of Turkish citizens, demonstrating the
application of BD techniques in social science studies.
Similar research efforts include the analysis of suspended spam accounts on X
(formerly Twitter), focusing on profile properties and interactions to identify spammers
and malicious users using BD techniques (Almaatouq et al., 2016). Another study by
Chainey, Tompson and Uhlig (2008) explores hotspot mapping for identifying spatial
patterns of crime. Their research concludes that hotspot mappings, utilising historical
data, can effectively pinpoint areas where crimes occur most frequently.
Moreover, Li, Yen, Lu and Wang (2012) used a large dataset from the Bank of
Taiwan to develop a BD system for identifying signs and patterns of fraudulent accounts.
They developed a detection system applying Bayesian classification and association rules. In a related vein, other research predicts the spreading dynamics of negative behaviour, detects emotional responses when browsing Facebook, and identifies impacts on national security using US intelligence community datasets (Crampton, 2015).
Exploration into various domains, as depicted in Table 3.1, reveals the significant
progress made with the adoption of BD. For instance, Popescu, Iskandaryan and Weber
(2019) undertook a study with data from the Plekhanov Russian University of Eco-
nomics and HAN University of Applied Science to comprehend how BD can enhance
the understanding of multifaceted aspects of international accreditations. The study
illustrated that BD can provide fresh methodologies for these institutions, emphasising
the transformative role of BD.
Furthermore, M. Zhang, Liu and Feng (2019) delved into the potential of BD for
tour and creative agencies. The goal of this research was not just data extraction but
also the formation of strategic objectives that businesses can subsequently utilise for
tangible benefits.
As can be discerned from Table 3.1, there is an extensive amount of research on
BD’s application across diverse sectors. The table offers a snapshot of these studies,
categorising them based on their contribution, title, and domain.
Table 3.1: A snapshot of studies on BD applications across domains
Study | Contribution | Domain
Luo, Wu, Gopukumar and Zhao (2016) | Application of big data in health care | Health Care
Murdoch and Detsky (2013) | Adoption of big data in health care | Health Care
Y. Zhang, Qiu, Tsai, Hassan and Alamri (2015) | Application of big data in cloud in healthcare cyber-physical system | Health Care
Bates, Saria, Ohno-Machado, Shah and Escobar (2014) | Exerting big data analytics to identify and manage high-risk and high-cost patients | Health Care
K. Lin, Xia, Wang, Tian and Song (2016) | Designing systems for emotion-aware healthcare using big data | Health Care
Guo and Vargo (2015) | Utilising big data to examine message networks such as Twitter and traditional news media | Social
Chainey et al. (2008) | Predicting spatial patterns of crime using big data | Crime and Fraud
Tran et al. (2018) | Data-driven approaches for credit card fraud detection | Crime and Fraud
Sigala (2019) | A book on big data and how it can bring innovation to tourism business | Tourism
Dezfouli, Shahraki and Zamani (2018) | Developing a tour model using big data | Tourism
Qin et al. (2019) | Utilisation of big data with Call Detail Record (CDR) data and mobile real-time location data to monitor tourist flow and travel behaviour | Tourism
BD and the benefits it brings have led to novel approaches in decision-making, often
referred to as data-grounded decision-making (Comuzzi & Patel, 2016). Such organisa-
tional adaptations aim to enact meaningful interventions both internally and externally,
especially in areas that previously relied more on intuition than on data and precision
(Wamba et al., 2017).
As data-driven strategies, philosophies, tools, and methodologies develop, they
revolutionise traditional views on decision-making, business process management, and
predictive models (Popovič, Hackney, Tassabehji & Castelli, 2018).
There is, and will continue to be, immense focus on the evident advantages of BD to
understand customer behaviour patterns, experimental orientation, organisational func-
tions, business insight, predictive decision-making, and more comprehensive business
and marketing plans compared to traditional methods (van den Driest, Sthanunathan &
Weed, 2016).
However, with profound changes come challenges. Beyond technical difficulties and
a lack of skilled personnel, transitioning to data-based management or a data-grounded
decision-making system demands a vision that aligns business goals with insights. This
alignment presents substantial potential for mistakes (Ranjan, 2019).
Investing in BD and deriving insights is essential for many competitive businesses.
Yet, it is crucial to ensure that these insights align with business objectives. From
planning to mentoring, strategising, networking, and communicating, data can of-
fer a consolidated perspective that encompasses the field’s primary aspects, guiding
executives towards successful outcomes.
For most businesses, the business plan remains central to an effective strategy and
execution. It guides decisions on ’where to play’ and ’how to win’ (van den Driest et
al., 2016). The business plan dictates resource allocation and financial forecasting, and outlines the roadmap for achieving targets. Here, insights can significantly influence
strategy, guide activities, and set business phases (Y.-S. Chen, 2018).
Despite BD’s prominence, most of these planning processes rely on an executive’s
judgement, which can be biased by past experiences, cognitive patterns, emotions, and personal beliefs. By incorporating insights into this critical phase, business activities can align more precisely with overarching goals. This is why high-performing companies
integrate insights into all primary decision-making stages, fostering a data-centric
culture (van den Driest et al., 2016).
Nevertheless, there is a notable lack of research into BD architecture and RAs for
data-driven systems. Most studies focus on BD analytics capability, pattern recognition,
and BD challenges. Meanwhile, other crucial areas, like insight orchestration, neces-
sary socio-technological developments, BD RAs, and data-centric artefacts, remain
overlooked (Mikalef, Pappas, Krogstie & Giannakos, 2018).
Several authors have stated that research often overlooks the concept of insight and
its role in achieving specific goals (van den Driest et al., 2016; H.-M. Chen, Kazman &
Matthes, 2016; H.-M. Chen, Kazman, Haziyev & Hrytsay, 2015). Various aspects of
data and its potential to systematically generate value require a well-defined architecture
within the IT-business value domain (Serra, 2024b).
Insufficient research into BD system architectures and system development sig-
nificantly constrains the potential benefits of BD, leaving professionals to navigate
unfamiliar territories (Mikalef et al., 2018). This gap in knowledge becomes particularly
critical when considering the transformative changes that BD promises across various
sectors. Previously, the concept of ‘transformative changes’ was introduced in the
context of BD’s academic research potential to revolutionise industries by enabling
data-driven decision-making and innovation. Indeed, such transformative changes intro-
duce a degree of uncertainty as organisations strive to evolve into data-driven entities.
These efforts demand a high level of adaptability to fully leverage BD in the current
market (McAfee & Brynjolfsson, 2012).
The burgeoning volume, velocity, and variety of data in modern BD landscapes ne-
cessitate a reevaluation of traditional system development methodologies, particularly
within the realm of database design and data architectures. Traditional methodologies,
anchored in the ANSI standard 3-tier DBMS architecture, have focused on relational
models that emphasise structured, tabular formats and ACID (Atomicity, Consistency,
Isolation, Durability) properties to ensure data integrity and transactional consistency
(Reis & Housley, 2022). While these models offer a systematic approach to data
management and evolution within defined schema constraints, they often fall short in
accommodating the dynamic and complex nature of contemporary BD, which demands
more flexible and scalable solutions (Elmasri, 2017).
Responding to these inadequacies, the industry has increasingly embraced agile and more flexible approaches to data management.
When confronted with the idiosyncrasies of BD, traditional relational data models,
despite their sustained dominance since the 1980s with technologies such as Microsoft SQL Server, MySQL, and Oracle databases, manifest palpable inadequacies (Dehghani,
2022). These deficits become apparent in various ways, particularly in the essential
need for horizontal scaling.
In BD architectures, the software and system requirements extend beyond mere
parallelisation, necessitating more advanced approaches such as clustering to achieve
optimal scalability. Additionally, these architectures need to support the heterogeneous
nature of data, which spans from unstructured to semi-structured forms. RDBMS, due
to their preference for centralised and rigid structures, often struggle to manage these
varied data forms effectively. An illustrative example is provided by MySQL (Oracle
Corporation, 2023), a widely-used traditional RDBMS.
MySQL excels at storing and querying structured data within a predefined schema.
However, when tasked with managing large volumes of unstructured data, such as
text from social media feeds or log files, MySQL’s efficiency diminishes. Its rigid
schema-based architecture is not inherently suited for data that lacks a fixed structure,
leading to complexities in data integration and querying processes (Rodríguez-Mazahua
et al., 2016).
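The following minimal sketch illustrates this contrast with hypothetical records: a relational row fixes its columns up front (schema-on-write), whereas semi-structured log events vary from record to record and resist a fixed schema.

# Illustrative contrast (hypothetical records): a schema-on-write relational row
# fixes its columns, while semi-structured events carry different fields per record.
import json

# A relational table defines its columns up front.
relational_columns = ("user_id", "timestamp", "action")
relational_row = ("user-42", "2023-08-01T10:15:00Z", "login")

# Semi-structured events vary from record to record; forcing them into the fixed
# columns above would require schema changes or lossy transformation.
events = [
    {"user": "user-42", "ts": "2023-08-01T10:15:00Z", "action": "login", "device": "mobile"},
    {"user": "user-77", "ts": "2023-08-01T10:16:30Z", "action": "post", "text": "hello", "tags": ["bd", "ra"]},
]

print(dict(zip(relational_columns, relational_row)))
for event in events:
    print(json.dumps(event))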
In response, a number of innovative technological solutions have emerged, includ-
ing Presto, a distributed SQL Query Engine (Presto, 2023); Airflow, an instrumental
platform for authoring, scheduling, and monitoring data pipelines (Apache, 2023a);
and Hadoop, epitomising open-source distributed computing (Apache, 2023c). These
developments underscore several limitations of traditional RDBMS in the BD milieu:
traditional SQL databases when dealing with Big Data’s unique demands. It high-
lighted the significant challenges these databases face, such as scalability and real-time
processing, which are not issues they were originally designed to tackle.
While traditional RDBMS systems have shown limitations in handling BD, the
data management landscape has evolved to include more sophisticated architectures
designed to address these challenges. Data warehouses, modern data warehouses, and
data lakes have emerged as solutions to manage and analyse large volumes of diverse
data (Serra, 2024a).
Traditional data warehouses, however, exhibit several limitations in the BD context:
• Struggle with the variety and velocity of BD, often requiring significant ETL processes
• High costs associated with hardware and licensing for large-scale deployments
The concept of modern data warehouses has emerged to address the limitations of
traditional data warehouses in the context of BD (Krishnan, 2013). These systems incorporate advanced features such as the separation of storage and compute, elastic scalability, and native support for semi-structured data.
Modern data warehouses aim to combine the reliability and consistency of traditional
data warehouses with the flexibility and scalability required for BD workloads (Serra,
2024a). A prominent example of a modern data warehouse is Snowflake, which offers a
cloud-native, fully managed data warehouse solution with separate storage and compute
layers for enhanced scalability and performance (Serra, 2024a).
While addressing many challenges of traditional data warehouses, modern data
warehouses still face some limitations:
• Potential for high costs, especially with increased data volumes and compute
requirements
• Data Ownership and Domain Expertise: Centralisation often separates data from
domain experts, leading to a loss of context and potential misinterpretation of
data.
• Lack of Agility: The centralised model can impede the agility of individual teams
or domains to innovate and evolve their data models independently.
• Data Quality Issues: With a central repository, there is a risk of propagating data quality issues across the entire organisation if not caught early.
Data lakes represent a paradigm shift in data architecture, designed to address the
variety and volume challenges of BD. Unlike data warehouses, data lakes employ a
schema-on-read approach, allowing for the storage of raw, unprocessed data in its native
format (Serra, 2024a). This approach offers several advantages:
• Support for diverse analytical workloads, including machine learning and artificial
intelligence applications
Despite their flexibility, data lakes also face challenges (Gorelik, 2019):
• The majority of the points mentioned for modern data warehouses in the previous section, such as data quality issues, data ownership, lack of agility, and increased complexity
These limitations point to the need for more advanced, flexible, and scalable archi-
tectures capable of handling the full spectrum of BD challenges. A detailed analysis
of these limitations and the emerging architectural solutions is presented in Chapter 4,
which explores reference architectures specifically designed for BD systems.
The following section will provide a detailed examination of BD’s characteristics,
thereby distinguishing it from conventional, or what is sometimes referred to as small
data.
Thus far, BD has been defined, and the elements surrounding its adoption and associated
challenges have been explored. However, a pertinent question remains: how does one
differentiate BD from small data? At which juncture does data qualify as BD? Both
academia and industry lack a universally accepted answer to this question, leading to
varied interpretations by different practitioners (H. Wang, Xu, Fujita & Liu, 2016).
A consistent theme among various definitions of BD is that a data workload is
classified as BD once it exhibits specific characteristics. The subsequent sections detail
these characteristics.
3.8.1 Volume
The sheer volume of data can pose significant technical challenges. Architectures
must demonstrate elasticity to accommodate data of varying magnitudes. While the
process of storing and computing vast quantities of data has been addressed to an
extent, achieving efficiency remains a challenge. Within this context, there exists an
inclination towards scalable, configurable architectures that leverage distributed and
parallel processing (H.-M. Chen, Kazman & Haziyev, 2016a).
3.8.2 Variety
Variety in the context of Big Data encapsulates the multitude of data formats present
in modern computational environments. This characteristic distinguishes Big Data not
by the sheer amount of data but by the diversity of data types it encompasses. These
span structured, semi-structured, and unstructured formats, including, but not limited
to, JSON and XML for semi-structured data; traditional databases and CSV files for
structured data; and text, images, videos, and logs for unstructured data.
Historically, databases have been designed to handle structured data, notably through
RDBMS that utilise tables, rows, and columns. However, these traditional systems were
not always adept at managing the burgeoning unstructured data formats. Binary Large
Objects (BLOBs) and Character Large Objects (CLOBs), which predate the 2000s,
serve as data types to accommodate larger data formats. For instance, BLOBs are apt
for multimedia formats such as videos, images, and sounds. These are not storage
solutions per se but represent formats within databases.
The limitations of RDBMS in handling diverse data led to the advent of NoSQL
databases. Contrary to misconceptions, NoSQL databases do not mark an evolution but
a divergence from RDBMS, catering specifically to varied data structures. MongoDB, for instance, is a document-oriented NoSQL database that stores semi-structured data as flexible, JSON-like documents rather than fixed tables.
3.8.3 Velocity
Velocity in the context of BD refers to the speed at which new data is generated and
collected. This encompasses the challenges of processing real-time data and making
rapid decisions, distinct from issues of volume or variety. Addressing these challenges
necessitates specific architectural decisions. Among the prominent architectures devel-
oped to handle such challenges are the Lambda Architecture (Marz & Warren, 2015), designed for real-time data processing, and the Kappa Architecture (Kreps, 2014a), which handles all processing through a single stream-based path.
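To illustrate the velocity dimension, the following minimal sketch simulates incremental, per-event processing in plain Python; a real deployment would consume from a message broker rather than an in-memory generator.

# Illustrative streaming-style processing: per-event (incremental) state updates,
# rather than accumulating a complete batch before computation; the in-memory
# generator stands in for an unbounded source such as a message broker.
from collections import Counter
from typing import Iterable, Iterator

def event_stream() -> Iterator[dict]:
    for event in [{"sensor": "s1", "value": 3}, {"sensor": "s2", "value": 7}, {"sensor": "s1", "value": 5}]:
        yield event

def process(stream: Iterable[dict]) -> None:
    counts: Counter = Counter()
    for event in stream:
        counts[event["sensor"]] += 1  # state is updated as each event arrives
        print(event["sensor"], "events seen:", counts[event["sensor"]])

process(event_stream())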
3.8.4 Veracity
Veracity pertains to the trustworthiness and authenticity of the data. Poor quality data,
which may be incomplete, unreliable, or outdated, poses significant challenges for BD
processing. Ensuring data integrity requires rigorous data cleansing, modelling, and
governance processes.
Data cleansing refers to the process of detecting and correcting (or removing)
errors and inconsistencies in data to improve its quality. Modelling, in this context,
pertains to the establishment of data standards and formats, ensuring consistent data
structures. Governance encompasses the overarching set of processes, policies, and
standards that ensure data quality throughout its lifecycle (Eryurek, Gilad, Lakshmanan,
Kibunguchy-Grant & Ashdown, 2021).
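As a minimal illustration of data cleansing, the following sketch applies a few hypothetical rules (mandatory fields, format normalisation, and range checks) to hypothetical records; it does not represent the cleansing logic of any particular system.

# Illustrative data cleansing (hypothetical records and rules): reject records with
# missing mandatory fields or implausible values, and normalise inconsistent formats.
raw_records = [
    {"id": "1", "email": "USER@EXAMPLE.COM ", "age": "34"},
    {"id": "2", "email": None, "age": "29"},             # missing mandatory field
    {"id": "3", "email": "x@example.com", "age": "-5"},  # implausible value
]

def cleanse(records):
    cleaned, rejected = [], []
    for record in records:
        if not record.get("email"):
            rejected.append((record, "missing email"))
            continue
        age = int(record["age"])
        if age < 0 or age > 120:
            rejected.append((record, "age out of range"))
            continue
        cleaned.append({"id": record["id"], "email": record["email"].strip().lower(), "age": age})
    return cleaned, rejected

cleaned, rejected = cleanse(raw_records)
print(len(cleaned), "cleaned,", len(rejected), "rejected")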
Legal and ethical challenges arise when data is acquired from unauthorised or
dubious sources, leading to concerns about privacy and data security. Thus, veracity
encompasses both the intrinsic quality of the data and the context of its acquisition.
Factors defining data trustworthiness include the collection method, the data origin, and
the platform used for processing. Data consistency, on the other hand, can be gauged
through statistical reliability measures (Demchenko, Grosso, De Laat & Membrey,
2013).
Addressing veracity in BD involves several essential components:
3.8.5 Value
Value is pivotal among the BD characteristics, as it is through value that the potential of
data is fully realised. To derive value from data, an integrative approach to both storage
and computing is essential. The concept of value in this context refers to the extraction of
knowledge, contingent on various events or processes and their interdependencies. Such
events or processes might manifest in diverse forms, including stochastic, probabilistic,
regular, or random natures (Demchenko et al., 2013).
The narrative literature review presented in this chapter has explored the evolution of
BD, its impact on architectures, applications across various domains, and the challenges
associated with its adoption. While the review highlights the significant potential
of BD and its transformative impact across industries, it also reveals several gaps,
inconsistencies, and areas for further research.
One notable issue is the lack of consensus on the definition and characterisation of
BD. While the "5 Vs" (Volume, Variety, Velocity, Veracity, and Value) are commonly
used to describe BD, there is no universally accepted definition. This inconsistency complicates the comparison of studies and the design of BD architectures.
3.10 Conclusion
In summary, this chapter provides an overview of BD, detailing its evolution, key characteristics, and widespread impact across various domains. It highlights the transition from traditional data management systems to advanced architectures necessary for handling BD workloads.
Chapter 4
A Systematic Literature Review on Big Data Reference Architectures
[Chapter overview figure: outlines the chapter's structure, including Section 4.5 (Review Methodology), Section 4.6 (Findings), and Section 4.7 (Discussion).]
4.1 Introduction
In the preceding chapter, the significance of BD was explained, spotlighting its omnipres-
ence, associated challenges, and key defining characteristics. Specific components,
including the exact definition of BD and criteria distinguishing a BD workload, were
delineated. Building on this foundation, the current chapter embarks on a SLR on BD
RAs.
The advancements in software technologies, digital devices, and networking in-
frastructures have bolstered users’ ability to produce data at an unprecedented rate.
In today’s data-centric era, the continuous generation of structured, semi-structured,
and unstructured data, when meticulously analysed, can reveal transformative patterns
(Ataei & Litchfield, 2020).
As discussed in Section 1.2.2, the proliferation of data has ushered in an ecosystem of technologies, BD being a notable member (Rad & Ataei, 2017b). Despite growing
interest, integrating BD into existing systems presents numerous challenges. Many
organisations struggle to effectively implement BD and realise its benefits, as evidenced
by various surveys.
These implementation challenges are further underscored by recent industry surveys.
As previously noted in Section 1.3.2, these reports reveal low success rates in data
strategy implementation, difficulties in transitioning to BD, and frequent abandonment
of data-related projects due to processing delays.
BD challenges encompass a range of issues, from the lack of business context
and rapid technological changes to organisational challenges, data architecture, and
talent shortages. Though such challenges are not unique to BD, they are accentuated
in its realm, given the complexities of BD engineering, real-time processing demands,
scalability prerequisites, and data sensitivities.
Presently, many BD systems are developed based on ad-hoc and intricate architec-
tural solutions, which can sometimes diverge from commonly accepted best practices in
software architecture and software engineering (Gorton & Klein, 2015). Such inconsis-
tency can lead to suboptimal decisions, hindering the evolution of BD systems. Given
the limitations of an ad-hoc design strategy, this study strives to provide a comprehen-
sive SLR on BD RAs, emphasising their potential in fostering a coherent approach to
BD system development. This SLR aims to address RQ1 directly and RQ2 indirectly.
The chapter commences by underscoring the necessity of RAs and progresses to
explore the challenges tied to the creation of RAs. Thereafter, it dissects the common
architectural components of BD RAs, pinpointing their inherent limitations.
This chapter is structured as follows: “Why Reference Architectures?” discusses
the indispensable role of RAs in grappling with complex challenges and highlights
real-world examples (Section 4.2). The subsequent section, "Reference Architectures
State of the Art", delves into the current BD system RAs, spotlighting their unique
attributes (Section 4.3).
"Objective of the SLR" clearly outlines the research questions driving this SLR
and the overarching aims (Section 4.4). "Review Methodology" details the research
methods adopted for this SLR (Section 4.5).
Following this, "Findings" presents the distilled insights and key takeaways from the
SLR (Section 4.6). A "Discussion" section engages in a rigorous analysis of the SLR’s
results (Section 4.7). Recognising the importance of credibility, the chapter assesses
"Threats to Validity", addressing the robustness and rigour of the findings (Section 4.8).
Finally, a "Conclusion" encapsulates the essence of the SLR and its implications for the
field (Section 4.9).
Despite the evident advantages of using RAs and their potential to address the complex
challenges of BD systems, there is a noticeable development gap in this area. Both
academic literature and industry findings suggest a need for increased focus. This
observation stems from a systematic review of the current body of knowledge and an
exploration of available BD RAs (Ataei & Litchfield, 2020).
The variability in definitions and the lack of a unified approach to BD RAs underscore the importance of an SLR of the current landscape. This gap in knowledge leads us to the objectives of the SLR.
Considering the documented failure rate of BD projects, there is a need to explore the
potential of RAs to enhance system development and BD architecture for better project
outcomes. A previous SLR in this domain was conducted by Ataei and Litchfield
(2020). This SLR found that RAs can be pivotal in addressing the complexities of BD
system developments, serving as a guide, and promoting the application of software
engineering knowledge and patterns.
Building upon the insights of the aforementioned SLR, the objective of the current
SLR is to identify and aggregate the BD RAs from the existing body of knowledge,
emphasise their architectural commonalities, and delineate their limitations. At the
foundation of this SLR lies the broader research direction of the thesis, which concerns
the architectural failure modes in BD projects and the role of RAs in addressing these
issues, as outlined in Section 2.3.
While the thesis research questions provide a high-level strategic perspective, the
research questions of this SLR delve into the operational details of BD RAs, presenting
a tactical view that complements the overall research aims. By combining this high-level
strategic analysis with specific empirical findings, we aim to conduct a thorough study
of BD architectures.
The following research questions have been formulated to guide this SLR:
SLR–RQ1 Which BD RAs are currently available in both academic and industrial
contexts?
SLR–RQ2 What are the common architectural components of the identified BD RAs?
SLR–RQ3 What are the limitations of the identified BD RAs?
These questions are formulated based on the identified gap in the body of knowledge
pertaining to BD RAs. The questions not only strive to encapsulate current best practices
in architecting BD systems but also aim to elucidate the limitations and challenges
faced in the present-day context of BD RAs.
This research adheres to the guidelines of PRISMA (Page et al., 2021). Furthermore,
PRISMA-S (Rethlefsen et al., 2021) is incorporated to refine the search strategy. In
addition, the guidelines from Kitchenham et al. (2015) for evidence-based software
engineering and systematic reviews are also employed. Even though PRISMA offers
comprehensive guidelines for conducting SLRs, its origin in the healthcare community
brings with it assumptions that may not align perfectly with the needs of software
engineering and information system researchers.
In response, Kitchenham et al. (2015) have contextualised many of these assump-
tions for the domain of software engineering, offering guidance especially for individual
researchers and projects with limited resources.
PRISMA serves as the foundation for this research design, complemented by other
methods to avoid bias, enhance transparency, and facilitate reproducibility of our
systematic approach. An SLR is selected as the methodological approach as it offers a
qualitative lens to advance knowledge and understanding around emergent topics and
their associated elements. Additionally, SLR delivers a transparent and reproducible
procedure that identifies patterns, relationships, and trends and paints a holistic picture
of the subject (Borrego et al., 2014).
The principal aim of this study revolves around assessing the current landscape of
BD RAs, pinpointing their primary architectural components, highlighting prevailing
theories, and discussing inherent limitations. This aim is methodically pursued across
four phases. The initial phase involves stating the research questions, defining the exclu-
sion and inclusion criteria, identifying and pooling relevant literature, and constructing
a quality framework.
During the subsequent phase, study titles undergo evaluation against the established
inclusion and exclusion criteria, followed by an assessment of the filtered studies’ title,
abstract, introduction, and conclusion. Thereafter, each study undergoes a comprehen-
sive analysis against the criteria set in the quality framework. In the third phase, the
chosen literature is systematically coded as per the research questions. Finally, the
findings undergo a thematic synthesis, and the resulting themes are elucidated.
This research extends upon the SLR conducted by Ataei and Litchfield (2020) by
encompassing the years 2020 to 2022. Contrasting with the work of Ataei, the present
study employs thematic synthesis, intending to provide a more granular examination of
BD RAs and their characteristics.
4.5.1 Identification
The first phase of the SLR involves the adoption of PRISMA-S (Rethlefsen et al.,
2021) to formulate a comprehensive multi-database search strategy. This extension
of PRISMA furnishes a 12-item framework that augments transparency, systematicity,
and minimises bias in the search approach. For this study, the following electronic
databases are investigated: ScienceDirect, IEEE Xplore, SpringerLink, AISeL, JSTOR,
and the ACM Digital Library. To achieve the objective of identifying all available literature on the
topic and ensuring no valuable research is missed, abstract and citation databases and
search engines such as Google Scholar and ResearchGate are also utilised.
Furthermore, a search of the grey literature is conducted. Grey literature refers to
research outputs not formally published in academic books or journal articles. The topic
of interest is "big data reference architectures". Using the search string "big data"
AND "reference architecture*" on Google, the first 40 results are selected
for screening. The search is executed in ‘incognito mode’ to mitigate the influence of
any personalised customisation of the search results. Reference lists of selected studies
undergo manual screening to pinpoint additional relevant research, ensuring the crucial
component of ‘completeness’ for SLRs, as articulated by Kitchenham et al. (2015).
Moreover, academic platform search capabilities may differ, but the search strategy
remains largely consistent. For example, when a platform does not support wildcards
(like asterisks), the terms are searched in both singular and plural forms. A notable
exception is SpringerLink, which does not accommodate bulk downloads of references
in BibTeX format. The keywords for the databases are:
selection, and a language limit is implemented through an advanced search with the
aforementioned keywords.
To systematically gather evidence, databases are searched using the specified keywords,
followed by bulk downloading of the BibTeX files. Only SpringerLink, Google Scholar,
and ResearchGate deviate from this procedure. For SpringerLink, studies are downloaded
in CSV format and then converted to BibTeX via a custom script. For Google Scholar
and ResearchGate, each study’s bib file is manually crafted.
Upon creation of all bib files, they are consolidated into a single bib file and
imported into the JabRef software for deduplication. Initially, 172 studies are pooled,
of which six duplicates are identified and removed. Additionally, the foundational SLR
for this study and another uncited paper are excluded. Conversely, five white papers and
four website blogs are incorporated. By the phase’s conclusion, 173 studies have been
amassed.
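To illustrate the kind of custom script mentioned above, the following is a minimal sketch that converts a SpringerLink CSV export into a rudimentary BibTeX file. The column names (Item Title, Authors, Publication Year, Item DOI) and the entry-key scheme are assumptions made for illustration; this is not the exact script used in this study.

import csv

def csv_to_bibtex(csv_path: str, bib_path: str) -> None:
    # Convert a SpringerLink CSV export into a rudimentary BibTeX file.
    # The column names used here are assumed and may need adjusting to
    # the actual export format.
    with open(csv_path, newline="", encoding="utf-8") as src, \
         open(bib_path, "w", encoding="utf-8") as dst:
        for i, row in enumerate(csv.DictReader(src), start=1):
            key = f"springer{i}"
            dst.write(f"@article{{{key},\n")
            dst.write(f"  title  = {{{row.get('Item Title', '')}}},\n")
            dst.write(f"  author = {{{row.get('Authors', '')}}},\n")
            dst.write(f"  year   = {{{row.get('Publication Year', '')}}},\n")
            dst.write(f"  doi    = {{{row.get('Item DOI', '')}}}\n")
            dst.write("}\n\n")

# Example usage (hypothetical file names):
# csv_to_bibtex("springerlink_export.csv", "springerlink.bib")

The resulting bib file can then be merged with the others and deduplicated in JabRef as described above.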
The first stage of screening begins with assessing the title, abstract, and keywords of the
pooled studies. For grey literature, only the title is considered. This assessment relies
on specific inclusion and exclusion criteria. The inclusion criteria are as follows:
• Primary and secondary studies (including grey literature) published between
1 January 2010 and 1 August 2023 on the topics of BD RA, BD architecture, and BD
architectural components.
• Research that indicates the current state of RAs in the field of BD and demon-
strates possible outcomes.
• Studies that are scholarly publications, books, book chapters, theses, dissertations,
or conference proceedings.
• Duplicate reports of the same study (a conference and journal version of the same
paper).
In the second stage, after excluding papers based on the criteria, and as suggested
by Kitchenham et al. (2015), the studies undergo quality assessment. The quality of the
evidence collected as a result of this SLR directly influences the quality of the findings,
emphasising the importance of quality assessment.
However, assessing quality presents well-known complexities. Among the most
fundamental are defining ‘quality’ and appraising the quality of conference papers,
which often lack comprehensive details on research methodology and evaluation. In
general, a study’s quality relates closely to its research method and the validity of its
findings. From this perspective, inspired by the works of Noblit and Hare (1988)
on meta-ethnography and Dybå and Dingsøyr (2008), studies’ quality is assessed by the
degree to which the conduct, design, and analysis of research are prone to systematic
errors or bias (Cumpston et al., 2019). The more bias in the selected literature, the
higher the likelihood of misleading conclusions.
Given the diverse nature of software engineering and IS papers, and the challenge
of defining quality in studies of various natures, the analysis initially considers several
well-established checklists, such as the Critical Appraisal Skills Programme (CASP)
(Critical Appraisal Skills Programme, 2023), and JBI’s critical appraisal tool (Munn et
al., 2020). However, recognising the need for criteria tailored to software engineering
and IS, this research refers to the checklist provided by Runeson, Andersson, Thelin,
Andrews and Berling (2006) for software engineering case studies. Similarly, Dybå and
Dingsøyr (2008) propose quality criteria based on the CASP checklist for qualitative
studies in software engineering systematic reviews.
Despite these resources, the challenge remains that this research comprises numerous
study types that must be assessed against a single checklist. To address this concern, a criteria
set consisting of seven elements is developed. These criteria are informed by the
CASP’s recommendations for assessing qualitative research quality (Critical Appraisal
Skills Programme, 2023) and by guidelines provided by Kitchenham et al. (2002) on
empirical research in software engineering. The seven criteria test literature in four
major areas that can significantly impact the studies’ quality. These categories and their
corresponding criteria are:
1. Minimum quality threshold:
(a) Whether the study reports empirical research or is merely a report based on
expert opinion.
(b) Clear communication of the study’s objectives and aims, including the
rationale for undertaking the study.
2. Rigour:
3. Credibility:
4. Relevance:
Collectively, these seven criteria measure the extent to which a study’s findings
might contribute valuably to the review. The criteria serve as a checklist, with each
property being dichotomous, that is, ‘yes’ or ‘no’, and the assessment takes place in
two phases. In the initial phase, the assessment focuses solely on the first major area:
the minimum quality threshold. If a study surpasses this phase, the next assessment
includes credibility, rigour, and relevance.
Another challenge encountered relates to the fact that a PhD thesis typically in-
volves a single researcher, making it infeasible to apply statistical measures such as
Cohen’s (1960) κ and Krippendorff’s (1970) α. Instead, the test-retest approach, as suggested by
Kitchenham et al. (2015), is employed.
Following this approach, papers undergo an initial assessment and then a subsequent
assessment at a later time. A study’s quality is deemed satisfactory if 75% of the
responses are positive with at least 75% inter-rater reliability across responses, which
encompasses feedback from both the test and retest phases.
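As an illustration of this decision rule, the following sketch applies the two 75% thresholds to a study’s test and retest answers. The function and parameter names are assumptions, and the aggregation shown is one plausible reading of the rule rather than the exact procedure used.

def passes_quality_check(test: list[bool], retest: list[bool],
                         positive_threshold: float = 0.75,
                         agreement_threshold: float = 0.75) -> bool:
    # A study passes when at least 75% of all recorded answers are positive
    # and the test and retest passes agree on at least 75% of the items.
    if len(test) != len(retest) or not test:
        raise ValueError("test and retest must be equally long, non-empty lists")
    answers = test + retest
    positive_rate = sum(answers) / len(answers)
    agreement_rate = sum(a == b for a, b in zip(test, retest)) / len(test)
    return positive_rate >= positive_threshold and agreement_rate >= agreement_threshold

# Example: seven checklist answers recorded at two points in time
# passes_quality_check([True] * 6 + [False], [True] * 5 + [False] * 2)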
It should be noted that this quality framework does not apply to grey literature,
which undergoes assessment solely based on inclusion and exclusion criteria. During
the identification phase of this SLR, a total of 138 pieces of literature from academia
and 24 from grey literature were amassed. Some literature joins the pool through the
process of forward and backward searching. For example, upon reviewing the NIST
RA, additional references from Oracle, Facebook, and Amazon are incorporated into
the literature pool.
In the screening phase, any literature not aligning with the inclusion and exclusion
criteria is discarded. For instance, if a paper is excessively brief and fails to address BD
RA, its ecosystem, or its limitations, it gets excluded. This phase results in the exclusion
of 50 papers. Subsequently, by evaluating studies against the quality framework, 21
academic studies and 12 from the grey literature pool are excluded. The process is
visually represented in Figure 4.1.
At this stage, the research questions are established, the inclusion and exclusion criteria are
defined and applied, the quality assessment framework is developed, and the synthesis
of data commences. An essential component of this phase is data extraction, during
which the essence of the studies is acquired in an explicit and consistent manner.
Prior to the synthesis of the data, guidelines proposed by Cruzes and Dybå (2011)
for data extraction are employed. Data extraction begins with immersion in the entire
pool of literature (V. Braun & Clarke, 2006). From there, a structured reading approach
ensues, extracting three types of data:
This process poses challenges as some studies fail to adequately describe the method,
contextual information often lacks detail, and evaluation methods differ. Upon complet-
ing data extraction, the coding process begins. There are two main approaches: a
deductive or a priori approach (Miles & Huberman, 1994) and an inductive or grounded
theory approach (Corbin & Strauss, 2014).
Both of these approaches are long-established and can be effective. The NVivo software
is used to organise the files, and an initial set of a priori codes based on the research
questions is created. These codes include:
1. BD RAs (SLR–RQ1)
During the coding, it becomes apparent that some fundamental areas might not have
been well-established in academia and practice. Despite mentions of these concepts,
descriptions are often sparse. Therefore, four additional codes are introduced:
4. Challenges in BD RA development
Once all the literature is coded, the codes are categorised into themes. This process
involves the integration of initial codes into higher-order ones, which sometimes requires
the rearrangement and reclassification of codes. The process ends when no new themes
emerge and many of the initial themes have been incorporated into higher-order themes.
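The following sketch illustrates, with assumed code and theme names, how coded excerpts might be grouped under higher-order themes; the actual codebook lives in NVivo and is richer than this toy mapping.

from collections import defaultdict

# Hypothetical mapping of initial codes to higher-order themes.
CODE_TO_THEME = {
    "BD RAs": "Current landscape of BD RAs",
    "Architectural components": "Common architectural components",
    "Limitations": "Limitations of BD RAs",
    "Challenges in BD RA development": "RA development and evaluation",
}

def group_by_theme(coded_segments: list[tuple[str, str]]) -> dict[str, list[str]]:
    # Each coded segment is a (code, excerpt) pair extracted from a study.
    themes: dict[str, list[str]] = defaultdict(list)
    for code, excerpt in coded_segments:
        themes[CODE_TO_THEME.get(code, "Uncategorised")].append(excerpt)
    return dict(themes)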
The final step in data synthesis is the creation of a model based on higher-order
themes to delineate relationships and address the research questions of this SLR. The
outcome is a theory, alignment with previous theories, and the identification of relation-
ships. A challenge in this phase arises from heterogeneity, attributed to the inclusion of
grey literature and the variety of methodologies in software engineering research.
To ensure the robustness of the higher-order themes, the primary sources of variabil-
ity are identified as follows:
1. Variability of outcomes (some RAs evaluated in practice, while others are merely
compared against different RAs),
1. Credibility: Does the research align with research questions? Does the thematic
synthesis encompass the data adequately?
The detailed results of this process, including additional codes defined, themes
identified, and specific examples illustrating data extraction challenges, are presented in
the subsequent Section 4.6 (Findings). This upcoming section provides a comprehensive
overview of the outcomes from our systematic review process.
4.6 Findings
This section maps findings against the research questions in several sub-sections. To
enhance clarity, these sub-sections align with the research questions and models gen-
erated in the prior phase. The section starts by discussing fundamental concepts such
as RAs and their significance in BD system development and subsequently delves into
specific topics like current BD RAs and their limitations.
With the increasing complexity of systems, the principles and concepts of software
architecture become essential tools to address the challenges faced by practitioners
(Ataei & Litchfield, 2020). Representing a system using architectural concepts eases
the comprehension of the system’s essence, the properties it possesses, and its evolution.
This representation influences quality attributes such as performance, maintainability,
and scalability.
In recent years, IT architectures have been instrumental in the development and evo-
lution of systems. These architectures assist in the maintenance, planning, development,
and cost reduction of intricate systems (Martinez-Prieto, Cuesta, Arias & Fernández,
2015). An architecture provides clarity about the fundamental components of the
system, guiding development to meet specific requirements (Sievi-Korte, Richardson
& Beecham, 2019). This delineation produces manageable components to address
various facets of the problem and offers stakeholders an abstract artefact for reflection,
contribution, and communication (Kohler & Specht, 2019).
Successful IT artefacts often derive from an effective RA. Notable examples include
the Open Systems Interconnection Model (OSI) (Zimmermann, 1980), Open Authenti-
cation (OATH) (OATH, 2007), Common Object Request Broker Architecture (CORBA)
(Pope, 1998), and workflow management systems (WMS) (Greefhorst, 1999). In fact,
1. RAs operate at the highest level of abstraction: RAs encapsulate the essence
of the practice, portraying elements vital for communication, standardisation,
implementation, and maintenance of certain classes of systems. Therefore, RAs
focus on high-level architectural patterns and do not delve into details such as
specific frameworks or vendors. This positions RAs at a higher level of abstraction
than concrete architectures.
stakeholders could reduce their efficacy (Ataei & Litchfield, 2020; W. L. Chang
& Boyd, 2018).
Despite the high failure rate of BD projects, IT giants such as Google, Facebook, and
Amazon develop exclusive BD systems with intricate data pipelines, data management,
data procurement, batch and real-time analysis capabilities (Kohler & Specht, 2019).
These companies, with their vast resources, attract top talent globally to handle the
complexity inherent in the development of BD systems. However, many organisations
strive to benefit from BD analytics without such advantages.
BD systems deviate from traditional small data analytics paradigms, introduc-
ing various challenges including rapid technological changes (H.-M. Chen, Kazman,
Garbajosa & Gonzalez, 2017), system development, and data architecture challenges
(H. V. Jagadish et al., 2014), along with organisational challenges (Rada et al., 2017b).
Furthermore, BD systems are distributed by nature and account for multiple types of
data processing, typically batch and stream processing. Coupled with the intricacy
of maintaining and scaling data quality, metadata management, data catalogues, data
dimension modelling, and data evolvability, BD system design proves intricate.
5. Two approaches are required for data processing, stream and batch processing; or
fast and delayed processing.
Findings from this study indicate a limited number of frameworks available for the de-
sign and development of RAs. A frequently used approach is the Empirically Grounded
Reference Architectures by Galster and Avgeriou (2011a). This research methodology
gains recognition for its emphasis on empirical validity and foundation.
This methodology comprises six steps: 1) selection of the RA type, 2) selection of
the design strategy, 3) empirical acquisition of data, 4) construction of the RA,
5) enabling the RA with variability, and 6) evaluation of the RA.
Another notable contribution in this domain is a framework for the analysis and
design of software RAs by Angelov et al. (2012). The framework employs a multi-
dimensional classification space to categorise RAs and presents five major types.
Developed to support the analysis of RAs with respect to their goal, context, and
architectural specification/design, the method identifies three main dimensions, goal,
context, and design, each with corresponding sub-dimensions.
These dimensions and sub-dimensions stem from interrogatives such as ‘why’,
‘where’, ‘who’, ‘when’, ‘what’, and ‘how’. The question ‘why’ focuses on the RA goal,
while ‘who’, ‘when’, and ‘where’ address context, and ‘how’ and ‘what’ concentrate
on design dimensions. This framework divides RAs into two primary categories:
facilitation RAs and standardisation RAs.
Volk, Bosse, Bischoff and Turowski (2019) employ the Software Architecture Com-
parison Analysis Method (SCAM) to evaluate RAs based on their applicability. The
outcome of this research is a decision-support process for BD RA selection. Notably,
two frequently observed standards are ISO/IEC 25010, used to assess the quality of
software products for RAs (ISO, 2011), and ISO/IEC 42010 for architecture description
(International Organization for Standardization (ISO/IEC), 2017).
Evidence from this SLR reveals that many researchers and practitioners rely on
informal architectural description methods, such as boxes and lines, with a notable
exception in Geerdink (2013). Geerdink employs ArchiMate (Josey, Lankhorst, Band,
Jonkers & Quartel, 2016), a formal and standardised architectural description language
endorsed by ISO/IEC 42010. Informal modelling methods might lead to discrepancies
between system design and implementation (Zhu, 2005), lack adherence to established
standards, and do not further the development of modelling methodologies.
Thus, the importance of using standard architectural description languages for
discussing and illustrating ontologies becomes evident. Notably, Geerdink (2013)
utilises Hevner’s IS research framework (A. R. Hevner et al., 2004) to develop an RA,
aligning with the perspective that a BD RA is an information system artefact grounded
Among the challenges of developing RAs, evaluation stands out as particularly signifi-
cant (Maier, Serebrenik & Vanderfeesten, 2013). Galster and Avgeriou (2011a) indicate
that the fundamental pillars of evaluation are the correctness and the utility of the RA,
as well as its adaptability and instantiation efficiency.
RAs and concrete architectures exhibit different levels of abstraction and possess
distinct qualities. Although many well-established evaluation methods exist for concrete
architectures, such as Architecture Level Modifiability Analysis (Bengtsson & Bosch,
n.d.), Scenario-based Architecture Analysis Method (Kazman et al., 1994), Architecture
Trade-off Analysis Method (Kazman et al., 1998a), and Performance Assessment of
Software Architecture (Williams & Smith, n.d.), none of these methods can be directly
applied to RAs.
For example, ATAM depends on stakeholder participation in the early stages for the
creation of a utility tree. Given the high level of abstraction inherent to RAs, identifying
a clear group of stakeholders at this stage becomes challenging. Moreover, many
evaluation methodologies employ scenarios, but due to the abstract nature of RAs and
their potential applicability in various contexts, creating valid scenarios proves difficult.
This leads to either the development of a few general scenarios covering all aspects or
the creation of numerous specific scenarios for various RA facets. Both approaches
introduce potential threats to validity.
Given the issues highlighted above, existing architecture analysis methods appear
insufficient for evaluating RAs. Various research endeavours aim to address this shortfall.
For instance, Angelov et al. (2008) adapted ATAM to better suit RAs by inviting industry
representatives to participate in the evaluation process. This revised process entailed
s11 | SAP - NEC Reference Architecture for SAP HANA & Hadoop (SAP, 2016) | Practice | 2016
s12 | Big data architecture for construction waste analytics (CWA): A conceptual framework (Bilal et al., 2016) | Academia | 2016
s13 | A reference architecture for Big Data systems in the national security domain (Klein et al., n.d.) | Academia | 2016
s16 | Simplifying big data analytics systems with a reference architecture (Sang, Xu & Vrieze, 2017) | Academia | 2017
s17 | NIST Big Data interoperability framework (W. L. Chang & Boyd, 2018) | Practice | 2018
s18 | Extending reference architecture of big data systems towards machine learning in edge computing environments (Pääkkönen & Pakkala, 2020) | Academia | 2020
s19 | A Big Data Reference Architecture for Emergency Management (Iglesias, Favenza & Carrera, 2020) | Academia | 2020
s22 | NeoMycelia: A software reference architecture for big data systems (Ataei & Litchfield, 2021b) | Academia | 2021
Over the past few years, the BD domain has garnered significant attention, particularly
in BD system development. As noted in Section 4.3, in March 2012 the White House
announced an initiative for BD research and development (Kalil, 2012). This initiative
aimed to accelerate scientific and engineering
discovery, enhance national security, and facilitate knowledge extraction from extensive
and complex data sets (W. L. Chang, Grady et al., 2015). This project received support
from six federal departments with an investment exceeding 200 million USD.
In June 2013, NIST launched the NIST Big Data Public Working Group (NBD-PWG) (W. L. Chang et al., 2015). This
initiative saw widespread participation from various sectors, including practitioners,
researchers, agents, government representatives, and non-profit organisations.
A notable outcome of this project was the NIST BD RA (NBDRA). The US
Department of Defense states that the primary objective of NBDRA was to offer an
authoritative BD information source to guide and regulate practice. This RA is among
the most recent and comprehensive in the BD field. NBDRA consists of five functional
logical components interconnected by several interfaces, enveloped by two fabrics that
represent the cross-cutting nature of security, privacy, and management.
Along these lines, major IT corporations have also released their own BD RAs.
This SLR identified eight BD RAs from the industry, primarily sourced from white
papers. These papers originate from organisations such as IBM, Microsoft, Oracle, SAP,
and ISO. Among these industry-developed RAs, the Lambda architecture is one of the
most discussed and studied. Some BD RAs in practice were omitted
as they were deemed outdated or insufficiently detailed. For instance, the RA published
by Amazon on AWS’ official documentation page (Amazon Web Services, 2024) was
omitted as it lacks sufficient detail.
In academia, there have been a few contributions, such as a postgraduate master’s
dissertation (Maier et al., 2013) and a PhD thesis (Suthakar, 2017). Some universities,
including the University of Amsterdam, have introduced their own BD architecture
frameworks (W. L. Chang et al., 2015).
Numerous RAs have also been developed for specific domains. These architectures
often appear in short journal papers, with many authors intending to elaborate further in
subsequent publications. For example, Klein, Buglak, Blockow, Wuttke and Cooper
(2016) introduced a BD RA for the national security domain, while Weyrich and Ebert
(2015) focused on the IoT domain.
Despite the efforts in this field, a noticeable scarcity of comprehensive BD RAs
exists. The aforementioned studies primarily offer brief discussions on RAs in spe-
cific domains without delving deeply into quality attributes, data quality, metadata
management, security, or privacy.
To address SLR–RQ2, RAs listed in Table 4.1 undergo review and comparison to
highlight the common architectural components of BD RAs. Some RAs, like the work
of Klein et al. (2016), are presented in the form of a short paper, while others, such as
the NIST RA, provide comprehensive insight.
Many RAs draw inspiration from or base their structures on other RAs. For instance,
the ISO/IEC 20547 RA (International Organization for Standardization (ISO/IEC), 2020)
derives many of its components from the RA published by NIST (W. L. Chang et al., 2015).
This reinforces the notion that RAs often derive their effectiveness from existing
knowledge rather than from original constructs.
In a systematic approach to this inquiry and following data extraction, all the
components from the BD RAs listed in Section 4.6.5 are tabulated in Table 4.2.
These components are the exact names of the architectural components used in the
diagrams presented in each study.
RA | Components
s9 | Access Manager, Intel Big Data Analysis Platform, Data Ingestion, Data Sources
s10 | Data Sources, Data Extraction, Data Loading and Pre-Loading, Data Processing, Data Storage, Data Analysis, Data Loading and Transformation, Interfacing and Visualisation
s11 | Data Input Sources, Data Processing Platform, Processed Data for Client
s13 | Data Providers, Big Data Application Layer, Big Data Framework Provider, Data Consumers
s14 | Data Generation, Data Streams, Data Storage, Stream Processing, Data Warehouse, Hadoop Cluster, Machine Learning, Presentation
s16 | Data Source, Data Integration, Data Analysis and Aggregation, Interface/Visualisation
s17 | Data Provider, System Orchestrator, Big Data Application Provider, Big Data Framework Provider, Security and Privacy Fabric, Management Fabric, Data Consumer
s18 | Data Sources, Data Extraction, Data Loading and Preprocessing, Data Processing, Data Storage, Model Development and Interface, Data Transformation and Serving, Interfacing and Visualisation
s19 | Data Provider, Big Data Application Provider, Big Data Framework Provider, System Orchestrator, Management Fabric, Security and Privacy Fabric, Data Consumer
s20 | Big Data Application Provider, Big Data Processing Layer, Big Data Platform Layer, Big Data Infrastructure Layer, Integration, Security and Privacy, System Management, Big Data Provider, Big Data Consumer
s21 | Acquisition Layer, Refinement Layer, Scrutiny Layer, Training Layer, Insight Layer
s22 | Gateway, Stream Processing Service Mesh, Stream Processing Controller, Monitoring, Service Discovery, Query Controller, Batch Processing Controller, Batch Processing Service Mesh, Event Backbone, Data Lake, Query Engine, Event Archive, Semantic Layer, Control Tower, MicroService, Sidecar, Event Queue
Each study opts for different terminology to describe its architectural components.
There appears to be no standardised method for modelling BD RAs. Utilisation of
architectural description languages like ArchiMate remains infrequent, with many studies
opting for specifically defined ontologies depicted using boxes and lines. This non-
standard approach complicates the understanding and comparison of these RAs, often
necessitating translation between ontologies.
An automated text analysis via NVivo on the names of these architectural compo-
nents helps identify commonalities and patterns in word usage. The results from this
analysis can be visualised in a word cloud, as seen in Figure 4.3.
Among the names used for components, big data application provider (five occur-
rences) and big data framework provider (three occurrences) appear most frequently.
This prevalence stems from some RAs being based on NIST BD RA (s17) and sub-
sequently adopting its terminology. Two terms universally used across studies are data
consumer and data provider. Furthermore, most studies prefer the term layer to group
different components of the RA logically.
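The following sketch illustrates the kind of word-frequency analysis described above, using Python’s standard library on a handful of component names from Table 4.2; it mimics, rather than reproduces, the NVivo procedure.

import re
from collections import Counter

# A small, abbreviated sample of component names from Table 4.2.
component_names = [
    "Big Data Application Provider", "Big Data Framework Provider",
    "Data Provider", "Data Consumer", "Data Storage", "Stream Processing",
    "Batch Processing Controller", "Security and Privacy Fabric",
]

word_counts = Counter(
    word
    for name in component_names
    for word in re.findall(r"[a-z]+", name.lower())
)
print(word_counts.most_common(5))  # e.g. [('data', 5), ('provider', 3), ...]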
To thoroughly address SLR–RQ2, attention is directed towards the description of
these components, categorising them based on their functions. These categories include:
BD Management and Storage, Data Processing and Application Interfaces, and BD
Infrastructure.
exist. For data warehouses, relational databases often reduce flexibility for analysis,
potentially incurring substantial costs. Conversely, data lakes allow the storage of
varied data without the need for a pre-defined schema, enhancing flexibility. However,
excessive flexibility can lead to misuse, with engineers potentially cluttering the data
lake. Ensuring data governance and active metadata can mitigate such challenges
(Dehghani, 2019).
This synthesis suggests that BD RAs draw from three main paradigms: 1) the
enterprise data warehouse paradigm (including the modern data warehouses), 2) the
data lake paradigm, and 3) the multi-modal cloud-based paradigm. Some RAs offer
higher abstraction levels. In cases like S20, making assumptions about data pipeline
nature and storage modality remains challenging. Only the resulting system can offer
clarity regarding the paradigm to which the RA aligns.
Regarding BD management, certain cross-cutting concerns often go unnoticed.
Many BD RAs do not adequately address security, privacy, metadata management, and
data quality. Few RAs focus on security, like S13, while others emphasise metadata
management, such as S15. A comprehensive exploration of BD cross-cutting concerns
appears lacking.
BD systems often encompass two major data processing activities: stream processing
and batch processing. Stream processing suits sensitive operations and time-sensitive
tasks, like detecting fraudulent credit card activity. Batch processing suits extended data
analysis, such as regression analysis.
The processing type needed for a specific architecture relies on the data’s char-
acteristics, mainly its variety, volume, and velocity. Different studies offer varied
abstraction levels regarding data processing. Some studies, like S19, detail data pro-
cessing pipeline processes, while others, like S15, abstract them to ‘batch processing’ or
‘stream processing’.
Moreover, two data processing categories emerge. One category employs separate
architectural constructs for batch and stream processing, while the other processes both
within a single architectural component.
BD interfaces are either presented solely as a ‘serving or access layer’ or as multiple
components tailored to different requirements. Some RAs, like s22, clearly delineate
ingress, egress, and inter-node interfaces, while others employ simplistic annotations.
BD Infrastructure
To address SLR–RQ3, the RAs collected for this SLR are appraised to identify limitations.
This is summarised in Table 4.3.
RA | Limitations
S2 This RA, designed specifically for healthcare and life sciences, does not consider
data quality, security, or privacy. It bears a resemblance to monolithic n-tier
architectures and relies on IBM-specific solutions. There is no mention of data
accountability or interoperability.
S7 This RA provides only the bare minimum components for data analytics, without
any clear identification of stream processing. The RA is designed in a reductionist
manner, without addressing privacy, security, metadata, and data quality. Data
storage seems to be associated only with hardware, and it is thus unclear how
data is evolved and scaled.
S9 This BD RA, designed by Intel for healthcare applications, lacks detail and does
not address cross-cutting concerns such as security, metadata, privacy, and data
quality. The concept of the access manager seems vague, as it is unclear how
access is managed. The artefact seems to be a simple instance of the Hadoop
ecosystem with some extra components added.
S11 This RA is made up of three major phases: ingestion, processing, and presentation.
It is designed around the Hadoop ecosystem and provides only the bare minimum
necessary to conduct data analytics. No discussion of metadata management,
security, privacy, or data quality could be found. The data pipeline uses a data
warehouse to communicate with the Hadoop side of the architecture. It is unclear
how unstructured data is handled and how data lineage is achieved.
S12 This architecture is specifically designed for waste management, and seems to
be using an approach similar to Kappa. The data takes a generic flow from data
sources to application, without any clear identification of data quality, privacy
and security concerns.
S13 This RA is specifically designed for the security domain and draws heavily on
the NIST BD RA. The RA is laid out in fabrics just like the NIST one and, unlike
many others, does mention cross-cutting concerns such as security explicitly.
However, the concept of data ownership is not discussed, there is no mention of
metadata or privacy, and the artefact evaluation is not extensive. That is, it is
unclear how the solutions derived from this RA can scale.
S14 A generic RA that resembles the Lambda architecture, with stream and batch
processing handled in different nodes. This RA utilises data warehouses and a
Hadoop cluster for data processing. It is unclear how security, privacy, metadata,
and data quality are achieved. Maintainability aspects are not discussed either.
S15 This RA extends the Lambda architecture by adding a semantic layer. It places a
strong focus on handling metadata properly, but it does not seem to identify other
cross-cutting concerns such as privacy, security, or data quality. It also adopts the
idea of separate batch and stream layers, which can potentially affect modifiability
negatively and increase cost.
S16 This RA clearly segregates stream data from other data, and defines clear
interfaces for ingestion of different data types. It then passes the data directly to
a distributed storage (Hadoop’s HDFS), and retrieves it later for deduplication
and cleaning. While the RA seems to have addressed the minimum requirements
of data analytics, it does not seem to address cross-cutting concerns such as
metadata, security and privacy. It is also unclear how data quality is achieved.
S17 This is perhaps the most comprehensive BD RA found in this SLR, and it has been
heavily funded by the government of the USA. While this RA is a good tool to
facilitate open discussion, design structures, requirements, and operations inherent
in BD, it is more of a high-level conceptual model of BD than an effective BD RA.
Some of the limitations observed in this RA are its brief mention of metadata
management (discussed only under lifecycle management), unclear approaches to
attaining data quality and data ownership, and the potential monolithic coupling
of components in the BD application provider.
S18 This RA segregates the data extraction and data loading phases and tends to adopt
the idea of distinct stream and batch processing layers. Nevertheless, no
identification of cross-cutting concerns such as privacy, security, or metadata
management could be found, and it is unclear how data quality is achieved.
Three storage components are designed, but it appears that all data is eventually
stored in one large store, which can make modifiability harder and create a
choke point.
S19 Derived from S17 (NIST BD RA), this RA is largely identical to S17 but
tailored specifically for emergency management. The limitations discussed for
S17 apply to this RA as well, so a separate explanation is not provided.
S20 This RA shares all the fundamental components with the NIST BD RA and seems
to be very similar; however, the term fabrics appears to have been changed to
multi-layer functions. Therefore this RA, just like NIST’s, is too abstract and
leaves many architectural decisions open, such as data storage, data quality
assurance, and data ownership. It is unclear how storage should be approached,
and the overall structure resembles a monolithic data pipeline architecture.
S21 This is one of the few RAs that incorporate the concept of microservices into BD
development. Nevertheless, the RA seems to be driven by the idea of one data
lake for all data storage, which can be daunting to scale and maintain. Metadata
does not seem to be discussed, and other concerns such as security, privacy, data
quality, and data provenance remain unclear.
Except for one case (S22), all the architectures and RAs found as a result of this
study were designed with an underlying monolithic data pipeline architecture, with
four major components: data consumers, data processing, data infrastructure, and
data providers. To discuss the integral issues that beset these architectures, one
must look at their characteristics and the ways in which they achieve their ends.
Findings from this SLR and a deep analysis of the RAs found highlight three generations
of BD architectures:
complicated task.
3. Cloud-Based Solutions: Given the cost and complexity of running a data lake
on-premise alongside the whole data engineering pipeline, and the substantial
talent gap currently faced in the market (Rada et al., 2017b), the third generation
of BD architectures tends to revolve around as-a-service or on-demand
cloud-based solutions. This generation of architecture tends to lean towards
stream processing with architectures such as Kappa (J. Lin, 2017), or frameworks
that unify batch and stream processing such as Apache Beam (Apache Software
Foundation, 2021) or Databricks (Databricks Inc., 2021). This is usually
accompanied by cloud storage such as Amazon S3 and streaming technologies such
as Amazon Kinesis. Whereas this generation tends to solve various issues regarding
the complexity and cost of data handling and ingestion, it still suffers from the same
fundamental architectural challenges: it does not have clear data domains, it is run
by a group of siloed, hyper-specialised data engineers, and data storage through a
monolithic data pipeline soon becomes a choke point.
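As an illustration of the unified batch and stream processing mentioned above, the following Apache Beam (Python SDK) sketch counts events per domain. The input path, field name, and output location are hypothetical; the same pipeline definition could be executed in batch or streaming mode depending on the chosen source and runner.

import json
import apache_beam as beam

with beam.Pipeline() as pipeline:  # DirectRunner by default; other runners can be configured
    (
        pipeline
        | "ReadEvents" >> beam.io.ReadFromText("events/*.json")          # hypothetical input
        | "ParseJson" >> beam.Map(json.loads)
        | "KeyByDomain" >> beam.Map(lambda event: (event["domain"], 1))  # 'domain' is an assumed field
        | "CountPerDomain" >> beam.CombinePerKey(sum)
        | "FormatCsv" >> beam.Map(lambda kv: f"{kv[0]},{kv[1]}")
        | "WriteCounts" >> beam.io.WriteToText("output/domain_counts")   # hypothetical output
    )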
1. Data Ingestion: Systems ingest data from various parts of the enterprise, encom-
passing transactional, operational, and external data. For example, in veterinary
practice management software, the platform can ingest and persist transactional
data such as interaction with therapeutics, number of animals diagnosed, and
quantity of invoices created and medicines dispensed.
2. Data Transformation: Data from the preceding step undergoes cleansing for
duplication, quality, and privacy policy considerations. This data then undergoes a
3. Data Serving: At this juncture, the data meets various needs, from machine
learning to marketing analytics, business intelligence, product analysis, and
customer journey optimisation. In the context of veterinary practice management
software, the platform can offer, through event backbone systems such as Kafka,
real-time data about customers who have procured and been dispensed restricted
veterinary medicines (RVMs), to ensure these transactions meet the conditions of
the products’ registration.
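To make the event backbone idea concrete, the sketch below publishes a hypothetical RVM dispensing event to a Kafka topic using the kafka-python client. The broker address, topic name, and event fields are assumptions for illustration, not details of the reviewed architectures.

import json
from kafka import KafkaProducer  # kafka-python client

producer = KafkaProducer(
    bootstrap_servers="localhost:9092",
    value_serializer=lambda v: json.dumps(v).encode("utf-8"),
)

# Hypothetical event emitted when a restricted veterinary medicine is dispensed,
# so downstream consumers can verify the registration conditions in near real time.
event = {
    "event_type": "rvm_dispensed",
    "clinic_id": "clinic-042",
    "product_code": "RVM-1187",
    "quantity": 2,
    "dispensed_at": "2023-08-01T10:15:00Z",
}
producer.send("rvm-dispensing-events", value=event)
producer.flush()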
data variability, and the overall workload for the data engineering team, thus
prolonging the data serving process.
Workload management outside of primary data teams can cause delays and foster
tribal knowledge. This siloed approach hinders cross-functional collaboration and
impedes data-driven decision making (Hechler, Weihrauch & Wu, 2023). The
separation between data producers and consumers creates bottlenecks in data flow
and utilisation, often resulting in a lack of domain-specific knowledge within the
central data team (Company, 2019-2024).
data analytics approaches. This fragmentation can make it difficult to gain a holis-
tic view of data, as different systems may have incompatible formats, schemas, or
access protocols. As data volume grows, this lack of integration can become a
significant bottleneck, hindering the ability to efficiently process, analyse, and
derive insights from data. Additionally, managing and maintaining a complex in-
frastructure with multiple components can be costly and time-consuming, further
exacerbating the challenges of scaling and adapting to evolving business needs.
4.7 Discussion
In this section, a detailed summary of the findings from the SLR on BD RAs is provided,
highlighting the current landscape and examining the implications of these findings. The
research methodology has enabled an investigation into the state of the art in BD RAs,
revealing insights into their development, implementation, and maintenance. To the best
of the author’s knowledge, this study represents the first comprehensive SLR of BD RAs
in the academic domain, addressing a gap in the literature. Despite the important role
RAs play in the efficient development and ongoing support of BD systems, the findings
indicate a notable shortfall in focused academic attention towards these artefacts. This
oversight underscores the need for further research and discussion in this area.
The study most closely related to the present one in the domain of comparing and
analysing BD architectures is the one conducted by Volk et al. (2019). However,
their research does not focus on BD RAs but seeks to craft a decision support system
for selecting BD RAs. Its approach to BD architectures is somewhat superficial, not
aiming for a systematic collection.
Furthermore, the NIST BD RA (S17) researchers sourced a collection of white
papers from the BD Public Working Group as foundational material. These documents,
however, do not provide detailed BD RAs. Instead, they serve as conceptual proofs
This adherence to traditional methodologies raises concerns because it may not fully
accommodate the dynamic nature and scalability requirements of contemporary BD
environments. The reliance on monolithic paradigms could potentially hinder the ability
of organisations to effectively manage and derive insights from vast and varied data
sources, thereby limiting the agility and responsiveness needed in today’s data-driven
decision-making processes.
Neither the effort to integrate BD analytics into data warehouses nor the endeav-
our to bolster business intelligence using data lakes appears sustainable or scalable
(Dehghani, 2022). Consequently, the emphasis shifts to the necessity of upcoming
research trajectories focusing on decentralised and distributed BD RAs.
In the SLR conducted, significant gaps were identified in the current BD RAs that
are pertinent to their effectiveness in industrial applications. Specifically, limitations
were observed in architectures such as S1 (Lambda architecture) and S17 (NIST BD
interoperability framework), which are critical for the functionality of contemporary
data-driven systems. It was noted that S1 lacks a comprehensive approach to data archi-
tecture, failing to adequately address essential elements such as data quality, privacy, and
metadata management. Similarly, S17 is characterised by its broad conceptual model
for BD but exhibits deficiencies in addressing metadata management and the detailed
requirements for data quality and ownership. These findings, detailed in Section 4.6.7,
highlight the need for the development of BD RAs that more effectively address these
identified gaps, thus enhancing their applicability and relevance to the complexities of
current and future BD systems.
Subsequently, the analysis conducted in this SLR not only illuminated the limitations
within current BD RAs but also led to the identification of significant gaps in the domain,
particularly concerning the architectures’ ability to address specific failure modes in BD
projects. Such observations have precipitated the formulation of the research question:
How can a Reference Architecture be designed to mitigate or address these failure
modes?
Aligned with the protocols of PRISMA and the research methods detailed in
Section 4.5, an evaluation of validity threats is important. This evaluation aims to
transparently address and articulate any potential biases or limitations encountered in
the execution of this study. The threats to the study’s validity, as identified, are presented
in the ensuing discussion.
• Construct Validity: The selection of sources, search terms, and criteria for
inclusion and exclusion were designed with the intent to align closely with the
SLR’s objectives. Nonetheless, threats to construct validity may still be present,
emanating from possible subjective interpretations of study eligibility, the risk of
omitting pertinent studies due to the specificity of search terms, or the limitations
inherent in the databases searched. Efforts were made to meticulously craft the
search strategy to encompass the domain’s state of the art, yet the complete
exclusion of the possibility of overlooking relevant studies or misclassifying the
selected studies’ relevance cannot be guaranteed.
However, threats to internal validity might emerge from the variability in the
methodologies and quality of the included studies. The structured approach to
incorporating grey literature aimed to broaden the review’s scope, but it introduced
challenges in evaluating the robustness and reliability of these sources in comparison
to peer-reviewed academic literature.
4.9 Conclusion
This chapter sought to identify all BD RAs available in practice and academia. The
findings revealed that RAs can be an effective artefact for tackling complex BD system
development. RAs, while incorporating established patterns, represent a comprehensive
architectural framework that goes beyond mere pattern collections. They encompass
architectural decisions, design principles, quality attributes, and their complex
interactions, which together address a class of problems. The emergence of desired
behaviour and quality attributes requires careful consideration of how these
elements work together, along with contextual factors and implementation details.
These artefacts direct attention to architectural requirements and solve many of the
5.1 Introduction
Following the SLR on BD RAs in the previous chapter, this chapter transitions to focus
on microservices patterns. This dedicated SLR aims to address identified gaps and
discrepancies in the current academic understanding of microservices patterns.
Microservices patterns have gained significant traction in software engineering due
to their ability to tackle issues related to scalability, maintainability, and fault tolerance
in distributed systems (Ataei & Staegemann, 2023; Taibi, Lenarduzzi & Pahl, 2018).
These patterns provide proven solutions to common architectural problems, enabling
the design of loosely coupled, independently deployable, and highly resilient services.
Scalability is addressed by decomposing monolithic applications into smaller, in-
dependently scalable services. Maintainability is improved through modular and de-
coupled design, allowing for independent development, testing, and deployment of
services. Fault tolerance is achieved by isolating failures, preventing cascading effects,
and enabling graceful degradation.
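As a small illustration of the fault-isolation idea, the sketch below implements a minimal circuit breaker in Python. The class name, thresholds, and behaviour are assumptions chosen for brevity; production-grade implementations described in the literature are considerably more nuanced.

import time
from typing import Optional

class CircuitBreaker:
    # Trip after a number of consecutive failures and fail fast until a
    # cool-down period elapses, protecting callers from a struggling service.
    def __init__(self, max_failures: int = 3, reset_timeout: float = 30.0):
        self.max_failures = max_failures
        self.reset_timeout = reset_timeout
        self.failures = 0
        self.opened_at: Optional[float] = None

    def call(self, func, *args, **kwargs):
        if self.opened_at is not None:
            if time.monotonic() - self.opened_at < self.reset_timeout:
                raise RuntimeError("circuit open: failing fast")
            self.opened_at = None  # half-open: allow one trial call
        try:
            result = func(*args, **kwargs)
        except Exception:
            self.failures += 1
            if self.failures >= self.max_failures:
                self.opened_at = time.monotonic()
            raise
        self.failures = 0
        return result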
Given Metamycelium’s distributed nature and its emphasis on domain-driven design,
leveraging microservices patterns can significantly enhance its architectural robustness
and adaptability. Adopting these patterns enables Metamycelium to address the
scalability, maintainability, and fault tolerance challenges inherent in distributed BD
systems, and to deliver a modular, flexible, and resilient BD architecture that can evolve
with changing business requirements.
While previous works have contributed to cataloguing microservices patterns, a
comprehensive and widely accepted collection remains elusive. One notable work
in this domain is "Microservices Patterns: With examples in Java" by Richardson
(2018). However, Richardson’s book, although comprehensive, does not provide an
RQ1 What microservices patterns are recognised and described in academic literature?
RQ2 Which of these patterns can be applied or used for a domain-driven distributed
BD RA?
These research questions align with the overarching objective of this thesis: to
explore the architectural complexities and design considerations in building robust,
scalable, and efficient BD systems. By critically analysing microservices patterns and
their relevance to BD RAs, this chapter aims to contribute insights into how these
patterns can be leveraged to address the challenges of implementing scalable and
maintainable BD systems.
The exploration of microservices patterns is considered necessary because the
artefact of this study is domain-driven and distributed, and as a result, it can benefit from
advancements in the microservices domain. This aligns with the artefact development
methodology described in Section 2.9.4.
The methodology for conducting this SLR follows a systematic approach, similar
to the previous SLR in Chapter 4. The process includes clearly defined inclusion and
exclusion criteria, a comprehensive literature search strategy, and a structured evaluation
and screening process. The findings from this SLR will be analysed and synthesised to
identify potential microservices patterns for the development of the artefact.
This chapter is structured as follows: Section 5.2 outlines the SLR Methodology,
Section 5.2.1 describes the Evaluation and Screening process, and Section 5.3 presents
the Findings and Analysis derived from the SLR. From there on, Section 5.4 provides
a rationale for the selection of patterns for the creation of the artefact. Section 5.5
provides a critical discussion on the current state of microservices patterns and how
they relate to this study. Lastly, the study concludes by summarising the findings in
Section 5.6.
5.2 Methodology
The methodology used in this SLR closely aligns with the research methodology
employed in the BD RAs SLR discussed in Chapter 4. However, there are a few minor
differences between the two approaches. This methodology follows 14 steps: 1) select
data sources, 2) develop a search strategy, 3) develop inclusion and exclusion criteria, 4)
develop the quality framework, 5) pool literature based on the search strategy, 6) remove
duplicates, 7) scan studies’ titles based on inclusion and exclusion criteria, 8) remove
studies based on publication types, 9) scan studies abstracts and titles based on inclusion
and exclusion criteria, 10) assess studies based on the quality framework (includes three
phases), 11) extract data from the remaining papers, 12) code the extracted data, 13)
create themes out of codes, 14) present the results.
The first difference between this SLR and the SLR discussed in Chapter 4 is the
set of keywords used to search the literature. The keywords used for this SLR are
presented in Table 5.1. They are selected to comprehensively identify
relevant literature on microservices patterns, architectures, and design strategies. The
key concepts targeted include core microservices terms combined with pattern, architecture, and design-related terms (see Table 5.1).
By combining these core microservices terms with the various patterns, architec-
tures, and design elements, the search strategy is aimed at retrieving a diverse set of
scholarly works that investigate, describe, or evaluate microservices patterns and related
architectural constructs. The rationale behind these search terms is to comprehensively
cover the existing research on microservices patterns and design practices, which are
the primary focus of this SLR.
In the initial phase, 1,196 papers are removed due to duplication and publication type.
The remaining 1,868 papers are filtered by title to evaluate their relevance to the concepts
of microservices patterns or architectural constructs related to microservices. From the
result of this process, 1,699 of the 1,868 papers are excluded, leaving 169 items for the
next round.
In the second phase, the same approach is followed for abstracts. As a result, 138
papers are excluded. Subsequently, papers not written in English (despite having an
English abstract), published before 2012, or with fewer than six pages are removed. 23
papers are then selected for quality assessment against the quality framework. In the
next phase, a deeper probing is initiated by evaluating the remaining studies against the
quality framework. As a result of this process, 10 studies remain.
To further increase the comprehensiveness of the review process, following the
recommendation of Webster and Watson (2002), the initial keyword search is complemented
with a forward and backward search. Here, the ten identified papers are examined
for the papers they cite and the papers that cite them.
While the backward search can simply be based on the reference lists given in the
papers, the forward search is less straightforward, because there are several sources with
slightly varying information. To account for this, two different sources, namely Google
Scholar and ResearchGate, are used. However, both searches yield no new results that
satisfy the criteria applied in the initial search.
Instead, the 538 papers (combined for all papers and both sources, not accounting
for duplicates) found in the forward search comprise, inter alia, thesis works, preprints,
studies that are not directly related to microservices, papers that are too short and papers
that do not meet the quality criteria.
Regarding the backward search, most of the utilised conference papers and journal
articles with a focus on microservices are already captured by the initial search, further
highlighting its comprehensiveness. In total for the ten papers, and not accounting
for duplicates, there are 16 new entries mentioning microservices in the title that are,
however, ultimately not relevant for the focus of this work. Therefore, the final set still
consists of the ten contributions shown in Table 5.2.
5.3 Findings
As a result of this SLR and to answer RQ1, the data synthesis presents 28 microservices
patterns. These patterns are categorised based on their primary function and the specific
architectural challenges they aim to solve. This categorisation scheme is inspired by
the work of Richardson (2022), who proposed a comprehensive pattern language for microservices.
This SLR’s objective is not to describe each microservices pattern. Most of these patterns
are already defined in the works of Richardson (2018) and on the official Azure Architecture Center
website (Azure Architecture Center, 2024). Therefore, only the patterns used
directly or indirectly in the creation of either the design theories discussed in Section 6.4 or
the architectural constructs in Section 6.5 are explained. As a result, the following 10 patterns
are identified:
1. API Gateway
2. Gateway Offloading
3. External Configuration Store
4. Competing Consumers
5. Circuit Breaker
6. Log Aggregation
7. CQRS
8. Anti-Corruption Layer
9. Backend for Frontend (BFF)
10. Pipes and Filters
systems” (Ataei & Staegemann, 2023), which provided valuable insights into the rele-
vance and applicability of microservices patterns specifically for BD engineering. These
patterns are identified through various methods employed in the respective studies, such
as systematic literature reviews (S1, S5, S7, S10), case studies (S2, S9), and empirical
analyses (S3, S4, S6, S8).
The microservices patterns are described using a specific format and template proposed
by Buschmann, Meunier, Rohnert, Sommerlad and Stal (2008) in their book “Pattern-
Oriented Software Architecture: A System of Patterns.” This template provides a
structured way to document and communicate software patterns effectively.
The pattern template typically includes several elements such as the context in
which the pattern is applicable, the problem or challenge the pattern aims to address
(often framed as a set of questions), the forces or constraints that influence the solution,
variations or alternative implementations of the pattern, examples or known uses of the
pattern, the resulting context after applying the pattern, and related patterns that may be
relevant.
However, for the purposes of this study, the focus is on presenting the core essence
of each pattern in a concise and accessible manner, rather than providing an exhaustive
documentation of all pattern elements. Therefore, the decision was made to omit certain
elements from the pattern template, such as ‘forces’, ‘variation’, ‘examples’, ‘resulting
context’, ‘related patterns’, ‘known uses’, and ‘example application’.
Instead, the patterns are presented in a more streamlined format, starting with a
description of the context in which the pattern is applicable. This context sets the stage
for understanding the problem or challenge that the pattern addresses. The challenges
are then framed as a series of questions, highlighting the key concerns and considerations
that the pattern aims to resolve. Finally, the proposed solution is presented in the form of
the corresponding pattern, providing a high-level overview of how the pattern addresses
the challenges outlined in the context and questions.
The API Gateway pattern inspired the design of the Ingress Gateway, and Gateway Offloading
also shaped the design of both the Ingress Gateway and the Egress Gateway. The External
Configuration Store is implemented as Data Lichen, while Competing Consumers and Circuit
Breaker indirectly influenced the design of the Event Processing Interface. Log Aggregation
is implemented as the Telemetry Processor. CQRS indirectly affected the design of the Event
Backbone and the Service Mesh, and the Anti-Corruption Layer indirectly affected the design
of the Egress Gateway and the Service Mesh. BFF affected the design of the Batch Processing
and Stream Processing Controller components, and lastly Pipes and Filters affected the design
of communication in the Service Mesh. The mapping of the patterns to the architectural
components of Metamycelium is displayed in Table 5.4. Additionally, the architectural
components mapped to the patterns discussed here are explained in detail in Section 6.5.
In this section, the 10 microservices patterns that are identified as particularly relevant
and applicable to the design of the Metamycelium are described in detail. These patterns
are selected based on their potential to address common challenges and concerns in
BD engineering, such as scalability, resiliency, data management, and communication
between distributed components. As discussed hereinabove, these patterns are mapped
against the Metamycelium components, and their usage is further elaborated in the
explanation of the respective components in Section 6.5.
The descriptions aim to provide a clear understanding of each pattern’s context, the
challenges it addresses, and the proposed solution it offers. By presenting these patterns
and their relevance to BD systems, this section lays the foundation for understanding
how they influenced and shaped the design decisions behind various components of the
Metamycelium architecture.
It is important to note that while these patterns are well-established in the microser-
vices domain, their application and adaptation to the specific context of BD systems
may require unique considerations and tailoring. Therefore, the descriptions will also
highlight any specific nuances or interpretations relevant to the BD engineering domain.
Through these detailed pattern descriptions, readers will gain insights into some
of the rationale behind the architectural choices made in the Metamycelium RA and
how microservices patterns can be effectively leveraged to address the complexities
and challenges inherent in designing scalable, resilient, and efficient BD systems. This
section is necessary to familiarise the reader with the patterns that have directly and indirectly
affected the design of the Metamycelium artefact, providing the necessary background
and context for understanding the architectural decisions made.
API Gateway
Context: An agricultural frontend application, for example, may require access to data from
microservices dealing with crop management, livestock records, weather data,
soil analysis, and financial information. Similarly, a healthcare frontend may
need to aggregate data from patient records, laboratory tests, insurance claims,
and billing microservices to provide a comprehensive view.
Problem: How does the financial micro-frontend retrieve the data it needs from various
backends? Should it make separate Representational State Transfer (REST)
requests to different APIs and then combine them into the required data representation?
REST is an architectural style for distributed hypermedia systems, commonly
used in web services for its simplicity and statelessness. As microservices evolve
over time, how does this point-to-point approach adapt to changes? If the financial
microservice changes its API, how does it impact the data composition logic in
the frontend? How does the frontend get notified of new endpoints? How does it
handle authentication with each individual microservice?
Solution: The solution is to have one gateway that resolves different data necessary
for various micro-frontends (see Figure 5.1). The API gateway can act as a single
entry for all clients, handling version changes, reducing the network requests,
and addressing cross-cutting concerns. In addition, the API gateway can help
with load balancing. The gateway can either proxy/route requests to appropriate
services or it can fan out a request to multiple services. Under this approach,
the communication pattern is streamlined and micro-frontends are only required
to know about the gateway.
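To make the fan-out and aggregation behaviour concrete, the following minimal Python sketch shows a hypothetical gateway resolving a composite dashboard for the agricultural frontend by calling several backend services concurrently; the service functions, names, and payloads are illustrative assumptions rather than part of any particular implementation.

import asyncio

# Illustrative backend calls; in a real gateway these would be HTTP requests
# to the crop, livestock, and finance microservices (names are assumptions).
async def fetch_crops(farm_id: str) -> dict:
    await asyncio.sleep(0.1)                    # simulated network latency
    return {"crops": ["wheat", "barley"], "farm": farm_id}

async def fetch_livestock(farm_id: str) -> dict:
    await asyncio.sleep(0.1)
    return {"livestock": {"sheep": 120, "cattle": 40}}

async def fetch_finance(farm_id: str) -> dict:
    await asyncio.sleep(0.1)
    return {"balance": 5300.75}

async def gateway_dashboard(farm_id: str) -> dict:
    """Single entry point: fans out to backends and aggregates one response."""
    crops, livestock, finance = await asyncio.gather(
        fetch_crops(farm_id), fetch_livestock(farm_id), fetch_finance(farm_id)
    )
    # The micro-frontend only talks to the gateway and receives one payload.
    return {**crops, **livestock, **finance}

if __name__ == "__main__":
    print(asyncio.run(gateway_dashboard("farm-42")))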
Gateway Offloading
Context: In the context of the SaaS practice management system, various microser-
vices possess shared features. These features necessitate maintenance, config-
uration, and management. Such features include token validation, feature flag
management, SSL certificate management, encryption, and environment variable
management.
Problem: How does one go about handling these shared features? Should each team
write their own feature for their own services? If a feature is updated, should each
team then update their own implementation? How do we ensure that these features
conform to the same interface and standards? If a new feature is added, should we
communicate with three different teams to update their implementation? What
happens if an implementation of one team does not respect the specification?
Solution: Common features and cross-cutting concerns can be offloaded into a gate-
way (see Figure 5.2). This includes but is not limited to: SSL termination,
certificate management, feature flag management, environment variables man-
agement, secret management, monitoring, logging configurations, throttling, and
protocol translation. This approach simplifies the development of services, and
improves the maintainability of the system. In addition, features that require
special skills (privacy and security) can be developed by experts and propagated
to teams, eliminating the risk that non-expert developers may introduce. This
pattern also introduces more consistency and standardised interfaces, which help
with the communication, agility and productivity of development teams.
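A minimal sketch of the idea follows, assuming a hypothetical token-validation concern offloaded to the gateway so that the routed services never re-implement it; the request shape, routes, and token check are illustrative only.

from dataclasses import dataclass
from typing import Callable, Dict

# Hypothetical request shape and downstream services; purely illustrative.
@dataclass
class Request:
    path: str
    headers: Dict[str, str]

def crop_service(req: Request) -> str:
    return "crop data"

def billing_service(req: Request) -> str:
    return "billing data"

ROUTES: Dict[str, Callable[[Request], str]] = {
    "/crops": crop_service,
    "/billing": billing_service,
}

def validate_token(req: Request) -> bool:
    # Offloaded concern: every service relies on the gateway for this check.
    return req.headers.get("Authorization") == "Bearer valid-token"

def gateway(req: Request) -> str:
    if not validate_token(req):          # shared feature lives in one place
        return "401 Unauthorized"
    handler = ROUTES.get(req.path)
    return handler(req) if handler else "404 Not Found"

if __name__ == "__main__":
    print(gateway(Request("/crops", {"Authorization": "Bearer valid-token"})))
    print(gateway(Request("/billing", {})))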
External Configuration Store
Solution: Store all application configurations in an external store (see Figure 5.3).
This can include package versions, database credentials, network locations and
APIs. On startup, an application can request the corresponding configuration
from the external configuration store.
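The following minimal sketch illustrates this startup behaviour, with an in-process dictionary standing in for a real external configuration store; the service name and configuration keys are assumptions made purely for illustration.

import json

# Stand-in for an external configuration store (e.g. a key-value service);
# in practice this would be a network call, not an in-process dictionary.
CONFIG_STORE = {
    "billing-service": json.dumps({
        "db_host": "db.internal.example",
        "feature_flags": {"new_invoice_flow": True},
        "api_version": "2.3.1",
    })
}

def fetch_config(service_name: str) -> dict:
    """On startup, a service requests its configuration by name."""
    return json.loads(CONFIG_STORE[service_name])

class BillingService:
    def __init__(self) -> None:
        self.config = fetch_config("billing-service")   # pulled, not bundled

    def describe(self) -> str:
        return f"billing-service v{self.config['api_version']} -> {self.config['db_host']}"

if __name__ == "__main__":
    print(BillingService().describe())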
Competing Consumers
Context: A service receives many requests coming from various sources. In addition, the processing required
for different requests varies, and while some may be quite cheap, others might be
compute intensive.
Problem: Should only one consumer instance be responsible for incoming requests?
What happens if that consumer instance does not have the computing resources
available? What happens if that consumer instance fails? Relying on a single
consumer instance for handling incoming requests introduces vulnerabilities.
There is uncertainty about whether the single consumer has adequate computing
resources. Additionally, the potential failure of this lone consumer instance could
lead to a complete halt in processing requests.
Solution: A message queue system can be used to load balance requests to different
consuming services based on their availability (see Figure 5.4). In this case, a
group of consumer applications is created, which allows for timely processing of
incoming requests during peak time. This can be achieved either by a push model or a pull model.
This increases the elasticity, availability and reliability of the system. The queue
can act as a buffer between the producer and consumer instance, and help with
minimising the impact of a consumer service’s unavailability. The message queue can also
be enhanced with fault-tolerance mechanisms in case of node failures. Furthermore,
scalability is improved as new data consumers can be dynamically added. For
instance, in Amazon Web Services (AWS), auto scaling groups can be set for
Elastic Compute Cloud (EC2) instances, which are virtual servers in the cloud.
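A minimal sketch of the pattern using Python’s standard-library queue and threads; the number of workers, the message contents, and the simulated processing cost are arbitrary illustrative values.

import queue
import threading
import time

work_queue: "queue.Queue[str]" = queue.Queue()

def consumer(worker_id: int) -> None:
    # Each consumer competes for messages from the same queue.
    while True:
        try:
            message = work_queue.get(timeout=1)
        except queue.Empty:
            return
        time.sleep(0.05)                      # simulated processing cost
        print(f"worker-{worker_id} processed {message}")
        work_queue.task_done()

if __name__ == "__main__":
    for i in range(20):                       # producer enqueues requests
        work_queue.put(f"request-{i}")
    workers = [threading.Thread(target=consumer, args=(n,)) for n in range(3)]
    for w in workers:
        w.start()
    work_queue.join()                         # all requests processed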
Circuit Breaker
Problem: How does one handle the failing service? How should the failed service be
handled to avoid a ripple effect? Addressing service failures is crucial to prevent
a cascading impact on interconnected services. The challenge is to implement a
mechanism that effectively manages a failing service without causing disruptions
to the larger system.
Solution: An architect can employ the circuit breaker pattern. The circuit breaker
pattern prevents services from repeatedly calling the failing service. This allows
for the system to operate in spite of a failing node, which helps with saving CPU
cycles, improving availability, improving reliability and decreasing the chance of
faulty data. In addition, the circuit breaker signals fault resolution, which allows
the system to return to its default state.
In a common scenario, the circuit breaker acts as a proxy between the source and
destination services, and monitors the destination service. If the number of failing
requests reaches a certain threshold, the circuit breaker trips, blocking subsequent
requests to the destination. The circuit breaker then probes the failing service to
identify its health. Once the service becomes healthy again, the circuit breaker
allows requests to be passed to the destination.
1. Closed: the default state, in which the circuit breaker monitors incoming requests and counts failures
2. Open: entered once failures reach the threshold; requests to the destination are blocked for a timeout period
3. Half-open: after the timeout, a limited number of trial requests are let through. If these requests are
passed, it is assumed that the service is healthy, and the circuit breaker
switches to closed state. If any requests fail, the circuit breaker assumes the
fault is still present, so it reverts back to open state
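The state transitions above can be sketched in a few lines of Python; the threshold and timeout below are arbitrary illustrative values, and the class is a simplification rather than a production-grade breaker.

import time

class CircuitBreaker:
    """Minimal sketch of the closed / open / half-open behaviour."""

    def __init__(self, failure_threshold: int = 3, reset_timeout: float = 5.0) -> None:
        self.failure_threshold = failure_threshold
        self.reset_timeout = reset_timeout
        self.failures = 0
        self.state = "closed"
        self.opened_at = 0.0

    def call(self, func, *args, **kwargs):
        if self.state == "open":
            if time.time() - self.opened_at >= self.reset_timeout:
                self.state = "half-open"          # probe the destination again
            else:
                raise RuntimeError("circuit open: request blocked")
        try:
            result = func(*args, **kwargs)
        except Exception:
            self.failures += 1
            if self.state == "half-open" or self.failures >= self.failure_threshold:
                self.state = "open"               # trip and block subsequent calls
                self.opened_at = time.time()
            raise
        self.failures = 0
        self.state = "closed"                     # healthy again
        return result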
Log Aggregation
Problem: How to understand the root cause of an issue if it is spanning across multiple
services? Should one read the logs of one service, and then the logs of the other
and the next to try to make sense of the problem? Identifying the root cause of
issues in a multi-service environment presents a challenge. Without a unified
system for log analysis, tracing issues across interconnected services becomes
complex and time-consuming.
Solution: A centralised logging service can be implemented that retrieves logs from
different services and composes them together (see Figure 5.6). The developers
can then search and analyse these logs to make sense of the root cause. This elim-
inates the tedious task of going to each service, extracting logs and aggregating
them manually.
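A minimal sketch of a central collector that ingests structured log records from several services and reconstructs a request trace by correlation identifier; the record shape and service names are illustrative assumptions.

from collections import defaultdict
from typing import Dict, List

class LogAggregator:
    """Central store that collects log records from many services."""

    def __init__(self) -> None:
        self.by_correlation: Dict[str, List[dict]] = defaultdict(list)

    def ingest(self, service: str, correlation_id: str, message: str) -> None:
        self.by_correlation[correlation_id].append(
            {"service": service, "message": message}
        )

    def trace(self, correlation_id: str) -> List[dict]:
        # One query reconstructs the request's path across services.
        return self.by_correlation[correlation_id]

if __name__ == "__main__":
    logs = LogAggregator()
    logs.ingest("ingress-gateway", "req-7", "received request")
    logs.ingest("crop-service", "req-7", "query failed: timeout")
    print(logs.trace("req-7"))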
CQRS
Context: Suppose that a team is working on a data heavy service. This service needs
to scale and process a lot of data. Following the traditional approach, often
the same data model is used to query and update the database. Underlying this
approach, the read and write workloads both go to the same datastore.
Problem: How should the team optimise for read workloads? How should the team
optimise for the write workloads? Can the team optimise for both read and write
workloads? How does the team handle the mismatch between the read and write
representations of the data? How does the team ensure a certain performance
objective is met on read workloads?
The team faces a challenge in optimising data handling for both read and write
workloads. Striking a balance between these two operations can lead to discrep-
ancies in the representations of the data. Additionally, ensuring that specific
performance objectives are met for read workloads presents further complexities.
Solution: Implement the CQRS pattern to separate read and write workloads, using com-
mands to update the data and queries to read the data (see Figure 5.7). This is
usually achieved through a message queue asynchronously. Having the command
and query separated simplifies modeling, development, and maintenance of data
stores. In addition, the system will be able to support multiple denormalised
views that are optimised for a specific workload.
CQRS is commonly implemented in two distinct data stores. This allows for the
read database to optimise for read queries. For instance, it can store a materialised
view of the data, and avoid expensive joins or complex ORM mappings. The read
database can be a different type of data store. One might choose to use a graph
database such as Neo4J for relationship heavy datasets, or a NoSQL database such
as MongoDB for highly dynamic data. On the other hand, CQRS can potentially
increase complexity, introduce code-duplication and increase latency.
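A minimal sketch of the separation described above: commands mutate a write model, and a denormalised read model is updated from the resulting events. For brevity the propagation here is synchronous and in-process, whereas the text assumes an asynchronous message queue; the order domain and field names are illustrative assumptions.

from dataclasses import dataclass, field
from typing import Dict, List

# Write side: commands mutate the source-of-truth store.
@dataclass
class OrderWriteModel:
    orders: List[dict] = field(default_factory=list)

    def place_order(self, customer: str, amount: float) -> dict:
        event = {"customer": customer, "amount": amount}
        self.orders.append(event)
        return event                                   # event to propagate

# Read side: a denormalised view optimised for queries.
@dataclass
class OrderReadModel:
    totals_by_customer: Dict[str, float] = field(default_factory=dict)

    def apply(self, event: dict) -> None:
        c = event["customer"]
        self.totals_by_customer[c] = self.totals_by_customer.get(c, 0.0) + event["amount"]

    def total_for(self, customer: str) -> float:
        return self.totals_by_customer.get(customer, 0.0)

if __name__ == "__main__":
    writes, reads = OrderWriteModel(), OrderReadModel()
    reads.apply(writes.place_order("acme", 120.0))     # in practice: via a queue
    reads.apply(writes.place_order("acme", 30.0))
    print(reads.total_for("acme"))                     # 150.0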
Anti-Corruption Layer
Context: Most services rely on some other services for data or functionality. Each
service has its own domain model. Some of these services can be external services,
some of these services can be internal legacy services, and some of them can be
bleeding edge services. For these services to interoperate, there is a need for a
standard interface, protocol, data model or APIs.
Problem: How does one maintain access between legacy internal systems and bleeding
edge internal systems? How does one enable interoperability between legacy
internal services and external services? Should the bleeding edge service be
modified to account for legacy service’s interface or API? Should the internal
services support the API requirements of external services even if they are sub-
optimal? Should the semantics of legacy and external services be imposed on the
bleeding edge service? Should services be corrupted by the requirements of other
services?
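In the well-known formulation of this pattern, a translation layer is placed between the legacy or external model and the internal domain model so that neither corrupts the other. A minimal sketch follows, assuming a hypothetical legacy herd-management record format; the classes, field names, and unit conversion are illustrative only.

from dataclasses import dataclass

# Internal, bleeding-edge domain model.
@dataclass
class Animal:
    animal_id: str
    species: str
    weight_kg: float

class LegacyHerdSystem:
    """Hypothetical legacy service with its own conventions (imperial units, codes)."""
    def get_record(self, code: str) -> dict:
        return {"CODE": code, "SPEC": "BOV", "WEIGHT_LB": 1200}

class HerdAntiCorruptionLayer:
    """Translates the legacy model so it does not leak into the new domain."""
    SPECIES_MAP = {"BOV": "cattle", "OVI": "sheep"}

    def __init__(self, legacy: LegacyHerdSystem) -> None:
        self.legacy = legacy

    def animal(self, code: str) -> Animal:
        raw = self.legacy.get_record(code)
        return Animal(
            animal_id=raw["CODE"],
            species=self.SPECIES_MAP.get(raw["SPEC"], "unknown"),
            weight_kg=round(raw["WEIGHT_LB"] * 0.4536, 1),
        )

if __name__ == "__main__":
    print(HerdAntiCorruptionLayer(LegacyHerdSystem()).animal("A-17"))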
Backend for Frontend
Context: In a large scale system, a backend service needs to provide the necessary
APIs for various clients. A client can be the user’s browser, a mobile phone,
or an IoT device. As the number of clients grows, the traffic grows, and new
requirements emerge. As a result, the backend service needs to account for higher
level of abstraction to serve the requirements of different clients.
Problem: Should the backend service account for various clients? If the backend
service tries to account for all clients, how hard will it be to maintain this service?
Can a general-purpose highly abstract backend service be scaled and maintained
easily? If the web development team has a conflicting requirement with the
mobile development team, how does the backend service account for that? How
does the backend service provide optimised data for each client? How can the
backend service be optimised for various clients?
The challenge arises when determining whether a backend service should cater to
diverse clients. Accommodating every client type might complicate the service’s
maintainability. Striking a balance between a general-purpose, highly abstract
backend and the specific needs of web and mobile clients becomes crucial. Ad-
ditionally, optimising the service for each client type and managing conflicting
requirements between development teams further complicates the situation.
Solution: A dedicated backend that accounts for a specific client (frontend) can be cre-
ated. This introduces opportunities for optimising performance of each backend
to best match the needs of the frontend, without worrying much about introducing
side-effects to other frontends. In addition, the backend will be smaller, better
abstracted, less complex, and therefore easier to maintain and scale. Further-
more, this enables horizontal teams to work without side-effects and conflicting requirements.
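A minimal sketch of two client-specific backends composing the same downstream capabilities differently for web and mobile clients; the healthcare-style services and payloads are illustrative assumptions.

# Shared downstream capabilities (illustrative stand-ins for real services).
def patient_record(pid: str) -> dict:
    return {"id": pid, "name": "Jane Doe", "history": ["2021", "2022", "2023"]}

def lab_results(pid: str) -> list:
    return [{"test": "CBC", "value": "normal"}, {"test": "A1C", "value": "5.4"}]

def web_bff(pid: str) -> dict:
    """Web frontend: rich view with full history and all lab results."""
    record = patient_record(pid)
    return {**record, "labs": lab_results(pid)}

def mobile_bff(pid: str) -> dict:
    """Mobile frontend: trimmed payload optimised for small screens."""
    record = patient_record(pid)
    return {"name": record["name"], "latest_lab": lab_results(pid)[-1]}

if __name__ == "__main__":
    print(web_bff("p-1"))
    print(mobile_bff("p-1"))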
Pipes and Filters
Problem: Should all these processes be performed in one monolithic module? How
flexible is that approach? In light of emerging requirements how can one maintain
and scale the monolithic module? Is that the right level of abstraction? Does
this approach provide much opportunity to optimise or reuse parts of the
module?
The dilemma revolves around the use of a monolithic module for executing various
processes. Concerns arise regarding its flexibility, maintainability, scalability,
and the level of abstraction it provides. Moreover, there is uncertainty about the
opportunity it offers to optimise or reuse parts of the module.
Solution: Different processes can be broken down into their own components (filters),
each taking a single responsibility. This provides clean and modular components
that can be extended and modified with ease. This pattern is ubiquitous in Unix-like
operating systems; for example, it is common for system engineers to pipe
the result of the command ‘ls’ (list) into the command ‘grep’ (global search
for regular expression) or command ‘sed’ (stream editor). By standardising the
interface for data input and output, these filters can be easily combined to create a
more powerful whole. Composition then becomes natural, and the maintainability
increases. This pattern is portrayed in Figure 5.10.
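In the spirit of the Unix pipeline analogy, the following minimal sketch composes single-responsibility filters over a stream of records; the record fields and transformations are illustrative assumptions.

from functools import reduce
from typing import Callable, Iterable

Record = dict
Filter = Callable[[Iterable[Record]], Iterable[Record]]

# Each filter has a single responsibility and a standard interface:
# it consumes an iterable of records and yields records.
def drop_incomplete(records: Iterable[Record]) -> Iterable[Record]:
    return (r for r in records if r.get("value") is not None)

def normalise_units(records: Iterable[Record]) -> Iterable[Record]:
    return ({**r, "value": r["value"] / 1000} for r in records)   # grams to kilograms

def tag_source(records: Iterable[Record]) -> Iterable[Record]:
    return ({**r, "source": "sensor-net"} for r in records)

def pipeline(*filters: Filter) -> Filter:
    """Compose filters left-to-right, like `ls | grep | sed`."""
    return lambda records: reduce(lambda acc, f: f(acc), filters, records)

if __name__ == "__main__":
    raw = [{"value": 1500}, {"value": None}, {"value": 980}]
    process = pipeline(drop_incomplete, normalise_units, tag_source)
    print(list(process(raw)))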
5.4 Rationale for Pattern Selection
This section presents the rationale for choosing the 10 microservices patterns discussed
hereinabove. A clear mapping between the components, requirements, theories and
principles of Metamycelium and the chosen microservices patterns is made. While
the requirements, artefacts and principles of the artefact are discussed later in
Chapter 6, this rationale is deemed necessary as it justifies the need for this SLR.
Moreover, Metamycelium’s software and system requirements are mentioned in this
section with their alphanumeric identifiers, which are described in Section 6.3.
The selection of the 10 microservices patterns for the artefact was driven by their
ability to address key challenges and concerns in BD engineering, while aligning with
the architectural principles, styles, and design decisions outlined in Section 6.4.
The API Gateway and Gateway Offloading patterns are chosen to facilitate efficient
communication and load balancing between components of the BD system and external
clients or data consumers within Metamycelium. These patterns directly influenced
the design of the Ingress Gateway and Egress Gateway components, serving as entry
and exit points. By consolidating cross-cutting concerns such as authentication, SSL
termination, and protocol translation at these gateways, Metamycelium promotes secure
and optimised data flow, addressing requirements like Vol-1, Var-1, Var-3, Var-4, Val-1,
Val-3, Val-4, SaP-1, and SaP-2.
The Competing Consumers and Circuit Breaker patterns are selected to enhance
fault tolerance, reliability, and resilience within Metamycelium’s distributed architec-
ture. These patterns indirectly influenced the design of the Event Processing Interface
component, which acts as an intermediary for handling events and managing service
communication. By enabling load balancing of requests across multiple consumer in-
stances and implementing circuit breaking mechanisms, Metamycelium aims to mitigate
the impact of service failures and ensure continued operation, addressing requirements
like Val-1 and Ver-1.
The Log Aggregation pattern is adopted to simplify troubleshooting and observabil-
ity across Metamycelium’s components. This pattern directly influenced the design
of the Telemetry Processor component, which centralises log collection and analysis
from various domains. By providing a unified view of logs and metrics, Metamycelium
facilitates root cause analysis and performance monitoring, indirectly addressing re-
quirements such as Vol-1, Vel-1, Val-1, and Ver-1.
The CQRS pattern is chosen to optimise data management and processing within
Metamycelium’s domains. By separating read and write workloads, this pattern influ-
enced the design of the Event Backbone and the Service Mesh components, enabling
domains to optimise their data stores and processing pipelines for specific workloads.
This pattern indirectly addresses requirements related to data processing, such as Vel-1,
Vel-2, Vel-3, Vel-4, Vel-5, Val-1, Val-2, Ver-1, Ver-2, and Ver-3.
The Anti-Corruption Layer pattern is selected to facilitate the integration of legacy
systems and external services with Metamycelium’s components. This pattern indirectly
influenced the design of the Egress Gateway and the Service Mesh, enabling domains to
communicate with external entities while maintaining their independence and avoiding
compromises on their interfaces, design, and technological approaches.
The BFF pattern influenced the design of the Batch Processing and Stream Pro-
cessing Controller components, dedicated to handling batch and streaming events
respectively. By separating concerns and optimising processing pipelines for specific
workloads, these components address requirements like Vel-1, Val-1, Val-2, Vel-2,
Vel-4, and Vel-5.
Finally, the Pipes and Filters pattern is adopted to facilitate modular and com-
posable data processing pipelines within each domain’s Service Mesh. This pattern
promotes separation of concerns, code reusability, and maintainability, aligning with
Metamycelium’s principles of loose coupling and domain-driven decentralisation.
The selection of these 10 patterns is guided by the overarching architectural prin-
ciples and styles defined for Metamycelium, such as domain-driven decentralisation
(AS1, AR1, AR2, AP1), event-driven communication (AP8), federated governance
(AP5, AP6, AR6), and the treatment of data as a first-class citizen (AP4). While other
patterns may have been considered, the chosen set aimed to strike a balance between
addressing key challenges in BD engineering while adhering to Metamycelium’s archi-
tectural vision, promoting scalability, resilience, and maintainability within a distributed,
domain-driven landscape.
5.5 Discussion
This SLR aimed to identify and analyse prevalent microservices patterns described
in academic literature and industry sources. Through a comprehensive search and
evaluation process, 28 distinct patterns that address various architectural challenges in
microservices-based systems are identified. These patterns are categorised into groups
based on their primary function and the specific architectural challenges they aim to
solve. The categorisation scheme was inspired by the work of Richardson (2018), who
proposed a comprehensive pattern language for microservices.
The adapted categorisation organised the patterns into seven main categories: Data
Management, Platform and Infrastructure, Communicational, Transactional, Logical,
Fault Tolerance, and Observability. This categorisation facilitates understanding the
role of each pattern within the overall microservices architecture and the selection and
application of appropriate patterns based on the specific requirements and challenges of
a given system.
The identified patterns serve as a valuable resource for architects and developers
working on microservices-based BD systems like Metamycelium. For the Metamycelium
RA, which aims to enable the development of robust, scalable, and efficient BD systems,
10 specific patterns are deemed particularly relevant and described in detail. These
patterns include API Gateway, Gateway Offloading, External Configuration Store, Com-
peting Consumers, Circuit Breaker, Log Aggregation, CQRS, Anti-Corruption Layer,
Backend for Frontend, and Pipes and Filters. These patterns address critical concerns
such as efficient communication, resilience, data management, and integration within a
distributed BD architecture.
The mapping of these patterns to specific components of the Metamycelium archi-
tecture highlighted their potential applicability and relevance in addressing real-world
challenges in BD engineering. For instance, the API Gateway and Gateway Offloading
patterns influenced the design of the Ingress Gateway and Egress Gateway components,
facilitating efficient and secure communication between microservices and external
clients within Metamycelium. The nuances and interpretations relevant to the BD
engineering domain are highlighted in the pattern descriptions, providing insights into
leveraging these patterns effectively in the context of scalable, distributed and resilient
BD architectures like Metamycelium.
This SLR contributes to the growing body of knowledge on microservices patterns
and their application in BD systems, specifically informing the design and development
of the Metamycelium RA. By consolidating and analysing relevant literature, a com-
prehensive overview of patterns that can guide architects and developers in designing
and implementing scalable, and maintainable microservices-based BD solutions like
Metamycelium is provided.
However, it is essential to acknowledge the limitations of this study. While the
search strategy aimed to be comprehensive, some relevant literature may have been
inadvertently excluded. Additionally, the selection and mapping of patterns to the
Metamycelium architecture components involved subjective judgments based on the
researchers’ expertise and understanding of the domain. The threats to validity discussed
in Section 4.8 apply to this SLR as well.
Future research could explore the practical implementation and evaluation of these
patterns within the Metamycelium RA and other real-world BD systems, investigating
their efficacy, trade-offs, and potential for integration with emerging technologies and
approaches in the field of BD engineering. Furthermore, as the microservices and BD
landscapes continue to evolve, regular updates and extensions to the identified pattern
catalogue may be necessary to ensure its relevance and comprehensiveness for the
development of Metamycelium and other BD RAs.
5.6 Conclusion
[Table excerpt: mapping of Metamycelium components to theory (Section 6.4) and microservices patterns (Section 5.3.4, Chapter 5)]
1. Ingress Gateway: links to the API Gateway pattern
4. Event Processing Interface: links to the Competing Consumers and Circuit Breaker patterns
7. Product Domain Service Mesh: links to the CQRS, Anti-Corruption Layer, and Pipes and Filters patterns
10. Telemetry Processor: links to the Log Aggregation pattern
14. Identity and Access Management: links to the Externalised Configuration Store pattern
15. Secret Management System: links to the Externalised Configuration Store pattern
6 Design: The Interplay of Theory and Artefact
6.1 Introduction
In the preceding chapter, microservices patterns have been studied and their correlation
to the artefact is discussed. This chapter is dedicated to the design and development of
a treatment, here referred to as the artefact. An artefact refers to physical or digital objects
created during the design process, while theories are abstract concepts that guide the
design. The relationship between artefact and theory is iterative, with each informing
and shaping the other. DSR can help bridge the gap between theory and practice and
lead to innovative solutions.
This chapter is made up of the following integral elements: 1) discussion on potential
stakeholders (Section 6.2), 2) the requirements specified for the artefact (Section 6.3),
3) the theories that underpin the design and development of the artefact (Section 6.4),
and 4) the artefact itself (Section 6.5).
6.2 Stakeholders
Stakeholders in DSR are individuals or groups who can influence or be affected by the
design, development, implementation, or use of the artefact being created (A. R. Hevner
et al., 2010). The identification of potential stakeholders for Metamycelium is based on
the analysis of industrial reports and surveys discussed in Section 2.9.1 of the research
methodology chapter.
The following roles are identified as potential stakeholders for Metamycelium:
1. Data Engineers: These professionals are tasked with designing, building, and
maintaining the data infrastructure that supports an organisation’s BD initiatives.
The introduction of a BD RA can influence data engineers in several ways, includ-
ing the need to familiarise with new tools and technologies and the requirement
to modify existing data pipelines to align with the new architecture.
2. Data Architects: These individuals are responsible for designing the comprehen-
sive data architecture that meets an organisation’s business needs. The introduc-
tion of a BD RA can influence data architects, necessitating modifications to their
existing data architecture to fit the new reference model or designing new data
models to support the new structure.
3. Data Stewards: Data stewards have a crucial role in ensuring the accuracy,
completeness, and security of an organisation’s data. They collaborate with data
engineers and data architects to manage data in a way that meets the organisation’s
data governance policies. The adoption of a BD RA can impact data stewards
in several ways, including modifying existing data governance policies to align
with the new architecture or working closely with data engineers and architects to
ensure consistent data management.
4. Data Scientists: Data scientists are responsible for analysing and interpreting
data to gain insights into business operations, customer behaviour, and other key
areas. The adoption of a BD RA can impact data scientists in multiple ways,
including the need to learn new tools and technologies to work with the new
architecture, and the requirement to modify their existing data models to align
with the new architecture.
6.3 Requirements
As a result of the processes conducted (see Section 2.9.3), and by carefully evaluating
similar approaches to requirement specification, a set of requirements for the
development of the artefact is tailored. These requirements are presented in terms of BD
characteristics in Table 6.1. Each BD characteristic is defined in Section 3.8.
Table 6.1 (excerpt): requirements by BD characteristic
Velocity:
Vel-1) System needs to support slow, bursty, and high throughput data transmission between data sources
Vel-2) System needs to stream data to data consumers in a timely manner
Vel-3) System needs to be able to ingest multiple, continuous, time-varying data streams
Vel-4) System shall support fast search from streaming and processed data with high accuracy and relevancy
Vel-5) System should be able to process data in a real-time or near real-time manner
Security & Privacy:
SaP-1) System needs to protect and retain the privacy and security of sensitive data
SaP-2) System needs to have access control, and multi-level, policy-driven authentication on protected data and processing nodes.
6.4 Theory
While the tools and technologies of data engineering have reached a scale and diversity
reminiscent of the Cambrian explosion (Figure 6.1), the underlying assumptions that
govern the data architectures have not been much challenged.
According to Richards and Ford (2020) there are two major categories of architec-
tures: 1) monolithic (deployed as a single unit) and 2) distributed (the system is made
up of sub-components that are deployed separately). Today, most data architectures are
in the first category.
While starting as a monolith can be a simple and good approach for building a
data-intensive system, it falls short as the solution scales. While this assumption is
challenged in the software engineering world, data engineering seems to still be driven
by monolithic designs (Ataei & Litchfield, 2022). These designs are enforced by enabler
technologies such as data warehouses and data lakes. In addition, many organisations
and books adopt the idea of a single source of truth.
Analytical data and operational data are two distinct types of data used in businesses.
Operational data is used to manage day-to-day business operations, while analytical
data is used to support strategic decision-making by identifying patterns and trends in
historical data.
Many of the challenges of current BD architectures stem from their fundamental
assumption of dividing operational and analytical data. While operational and analytical
data have different properties and are processed differently, bringing operational data
far from its original source can affect its integrity negatively, create organisational silos,
and produce data quality issues.
These two planes of data are usually operated under different organisational
hierarchies. Data scientists, business intelligence analysts, machine learning engineers,
data stewards, and data engineers are usually under the leadership of the Chief Data
and Analytics Officer (CDAO) and are heavily involved in creating business value out
of data. On the other hand, software engineers, product owners, and quality assurance
engineers are usually working with the Chief Technology Officer (CTO).
This has resulted in two segregated technology stacks and heavy investments in
bridging the two. This chasm has resulted in two different topologies and fragile
integration architectures through ETLs (Figure 6.2). This is usually achieved by some
sort of batch ETL job that aims to extract data from operational databases. These ETLs
usually do not have any clearly defined contracts with the operational database and
are sheer consumers of its data. This highlights the fragility of this architecture, as
the upstream changes from operational databases can affect downstream analytical
applications. Over time, ETL job complexity increases, maintainability becomes harder,
and data quality decreases.
Many of the technologies created over the years are developed under this very
assumption. While these technologies are effective in handling the volume, velocity,
and variety of data, today’s data challenges are about the proliferation of origins, data
quality and data architecture.
These challenges include the proliferation of data origins, ensuring data quality
across diverse sources, and designing data architectures that can seamlessly integrate
and process data from various systems in real-time or near-real-time (Manyika et al.,
2011; Davenport, Barth & Bean, 2012).
Data are usually collected and consolidated through several systems. Some of these
data may even go beyond the perimeters of the organisation. Therefore, based on the
premises discussed hereinabove and the challenges explicated in Section 4.6.7, it is
posited that today’s data architectures need a shift from the centralisation of data in one
big analytical database to connecting analytical data wherever it is.
Based on this, the artefact designed for this study learns from past solutions and
addresses their shortcomings. This artefact aims to move away from overly centralised
and inflexible data architectures that act as a coordination bottleneck. Therefore, one
of the objectives of this artefact is to bridge the gap between the point where data is
generated and the point at which it is used, thus simplifying the process. This artefact
aims to increase agility in the face of growth and respond effectively to organisational
changes. Therefore the architectural style of Metamycelium is as follows:
Today’s businesses are dealing with a great deal of complexity. A typical business
is made up of various domains with different structures. These domains change at
different rates and tend to be quite isolated from each other. The overall synergy of the
business is dictated by the relationship between these domains and the ways in which
these domains evolve.
At the heart of these synergies sits volatility and rapid change in the market and an
ever-increasing number of regulations. How do businesses today manage the impact
of these changes to their data? Should they constantly modify ETL jobs, create new
ETL backlogs, and consolidate data into operational stores? How can businesses create
quality and trustworthy data without slowing down? The answer lies in embracing and
adapting to the changes in today’s data landscape.
One way to tackle this complexity is to align technology with business. Businesses
break down their problem into smaller problems that are handled in each domain;
technology and data can be incorporated into those domains too. This approach is
well-established in microservices architectures (Ataei & Staegemann, 2023). Therefore,
the first and foremost rule of Metamycelium is:
Domain-driven design
Architectural Principle 1 (AP1): The domain that generates the data should own
that data.
agents. In one study, Reynolds (1987) analysed a synchronised flock of starling birds in
autumn. This study showed that every starling bird follows three simple rules:
1) alignment (following flockmates that are close by), 2) separation (so birds do not
collide with each other), and 3) cohesion (keeping the same pace as the neighbouring
flockmates). These rules can be mathematically expressed as follows:
Alignment:
v_i(t+1) = v_i(t) + \frac{1}{k} \sum_{j=1}^{N} \left( v_j(t) - v_i(t) \right)        (6.1)
where v_i(t) is the velocity vector of bird i at time t, k is a normalisation factor, and
N is the number of neighbouring birds.
Cohesion:
v_i(t+1) = v_i(t) + \frac{1}{k} \left( c_i(t) - p_i(t) \right)        (6.2)
where c_i(t) is the centre of mass of the neighbouring birds, p_i(t) is the position of
bird i at time t, and k is a normalisation factor.
Separation:
v_i(t+1) = v_i(t) + \sum_{j=1}^{N} \frac{p_i(t) - p_j(t)}{d_{ij}^{2}}        (6.3)
where p_i(t) is the position of bird i at time t, p_j(t) is the position of bird j at time t,
and d_{ij} is the distance between birds i and j.
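To make Equations 6.1 to 6.3 concrete, the following short sketch evaluates one update step for a toy flock; the positions, velocities, and normalisation factor are arbitrary illustrative values, not drawn from the cited study.

# Toy evaluation of the alignment, cohesion, and separation updates
# (Equations 6.1-6.3) for bird i among N neighbours. Values are illustrative.
k = 4.0                                        # normalisation factor
p = [(0.0, 0.0), (1.0, 0.5), (-0.5, 1.0)]      # positions p_j(t)
v = [(1.0, 0.0), (0.8, 0.2), (1.1, -0.1)]      # velocities v_j(t)
i, neighbours = 0, [1, 2]
N = len(neighbours)

def add(a, b): return (a[0] + b[0], a[1] + b[1])
def sub(a, b): return (a[0] - b[0], a[1] - b[1])
def scale(a, s): return (a[0] * s, a[1] * s)

# Alignment (6.1): steer towards the neighbours' velocities.
sum_v = (sum(v[j][0] for j in neighbours), sum(v[j][1] for j in neighbours))
alignment = add(v[i], scale(sub(sum_v, scale(v[i], N)), 1 / k))

# Cohesion (6.2): steer towards the neighbours' centre of mass c_i(t).
c = (sum(p[j][0] for j in neighbours) / N, sum(p[j][1] for j in neighbours) / N)
cohesion = add(v[i], scale(sub(c, p[i]), 1 / k))

# Separation (6.3): steer away from neighbours, weighted by 1/d_ij^2.
separation = v[i]
for j in neighbours:
    d2 = (p[i][0] - p[j][0]) ** 2 + (p[i][1] - p[j][1]) ** 2
    separation = add(separation, scale(sub(p[i], p[j]), 1 / d2))

print("alignment:", alignment, "cohesion:", cohesion, "separation:", separation)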
Starling birds do not need a centralised orchestrator to create this complex adaptive
system. In Metamycelium the aim is to promote a domain-driven distribution of data
ownership. This architecture is modelled in a way that a domain provides not only
operational data through a standard interface, but analytical data too. For
instance, in practice management software for veterinary practices, the animal domain provides
operational APIs for updating animal attributes, but it can also provide analytical
interfaces for retrieving animal data within a window of time. Every domain owns its
data.
In this fashion, the domain can also choose to retrieve data from other domains with
some sort of discovery mechanism, process the data and enrich its data. In some cases,
there can be a creation of aggregate domains with the main concern of aggregating data
from various domains and providing it for a specific use case.
This is to remove vertical dependency and allow teams to have their local autonomy
while being empowered with the right level of discovery and APIs. This architecture
promotes the idea of coequal nodes consolidated to achieve the overall goal of the
system rather than a centralised database of all data owned by people who do not
have domain knowledge. This concept is inspired by DDD as presented by
Evans (2004), data mesh (Dehghani, 2020), and microservices architecture (S. Newman,
2015b). Thus the next rule is as follows:
According to Ford, Parsons and Kua (2022), an architectural quantum is the smallest
unit of architecture that has high cohesion, includes all the structural elements to meet
its ends and is independently deployable. In the case of Metamycelium, the architectural
quantum is the data domain. This domain should encompass all the structural elements
required to achieve its functionality, should be able to have a separate lifecycle, and
should be able to evolve without introducing side effects to other domains.
The transition from monolithic n-tier architectures to SOA presented challenges,
notably in maintaining the Enterprise Service Bus (ESB), which became a bottleneck
due to the bloated transformation and logic operations. The emergence of microservices
architecture addressed these issues by promoting a shift from complex pipelines to
simpler ones, paired with smarter, self-contained services, often designed following
domain-driven principles (Ataei & Litchfield, 2023).
Although microservices bring their own complexities, they represent an evolutionary
step in software engineering akin to advancements in data engineering, where the focus
shifts from the tightly-coupled nature of traditional ETL processes to more agile and
decomposed approaches.
Metamycelium scales up by the addition of new data domains. That is, data domains
are the axis over which the system scales. This is in striking contrast with current
centralised architectures that choose their architectural quantum based on technology.
Based on that premise, the next principle is as follows:
Architectural Principle 2 (AP2): The architectural quantum should align with the
business, not technology.
To create an effective data product, it is important to include not only the data
itself but also the necessary infrastructure and systems to ensure that the data is easily
accessible, understandable, and secure. Additionally, the product should be designed to
Data
Data is the core of what a data domain provides through a standard interface. This
implies that the domain owns and stands accountable for the lifecycle of its data.
Depending on the nature of the contexts and technology stacks, the data domain can
provide its data in various formats including columnar, tabular, and files. The domain
can even choose to provide its data through storage systems. The domain must adhere
to a set of data contracts and global policies.
Metadata
Metadata accompanies the data a domain provides and plays a key role in maintaining the data, its quality and its lifecycle. For
instance, a data domain may include metadata about data semantics, data types, data
structure, statistical characteristics and service level objectives (SLOs).
As opposed to current BD architectures where metadata is usually centralised into
some sort of metadata management system, the data domain itself is responsible for
extracting, extrapolating and providing metadata.
Code
In this architectural approach, domains independently own their business logic, an-
alytical data, version history, and access management. They should own their code,
executing within their bounded context. Various code types, such as transformation code, API
code, and discovery and metadata handling code, are included within a domain. Cur-
rently, BD architectures typically externalise this as a separate artefact known as a
data pipeline. Metamycelium dispenses with external pipelines, favouring internal data
transformation codes. Thus, the next architectural rule is:
Architectural Rule 3 (AR3): The domain should entail all the necessary structures
to provide its data. These structures are data, metadata, code, and configuration.
This transformation code aligns with the domain’s business logic and particularities.
The data model is shaped by the team’s objectives for analytical and transactional
data. The domain is responsible for data cleansing and quality, ensuring metrics
like timeliness, integrity, accuracy, completeness, and consistency. Thus, the next
architectural principle is:
Architectural Principle 3 (AP3): The domain should adhere to a set of data quality
metrics. These metrics should be set from the governance layer.
reflecting the system’s context and architecture. This approach allows data engineers and
developers to build against a stable contract, preventing negative impacts on downstream
consumers. Additionally, the domain should have its input data API, ensuring a clear
ingress point for upstream data reads.
In essence, the domain should maintain three fundamental APIs: 1) ingress APIs, 2)
egress APIs, and 3) metadata APIs. Figure 6.5 illustrates this concept.
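A minimal sketch of these three API surfaces follows, with an illustrative animal domain; the method names and payloads are assumptions made for exposition and do not define the artefact's interfaces.

from abc import ABC, abstractmethod
from typing import Dict, Iterable, List

class DataDomain(ABC):
    """Illustrative shape of a data domain's three API surfaces."""

    @abstractmethod
    def ingest(self, records: Iterable[Dict]) -> None:
        """Ingress API: clear entry point for upstream data."""

    @abstractmethod
    def serve(self, since: str) -> List[Dict]:
        """Egress API: analytical data served under a stable contract."""

    @abstractmethod
    def metadata(self) -> Dict:
        """Metadata API: semantics, schema, quality metrics, SLOs."""

class AnimalDomain(DataDomain):
    def __init__(self) -> None:
        self._records: List[Dict] = []

    def ingest(self, records: Iterable[Dict]) -> None:
        self._records.extend(records)

    def serve(self, since: str) -> List[Dict]:
        return [r for r in self._records if r["updated"] >= since]

    def metadata(self) -> Dict:
        return {"schema": ["animal_id", "weight_kg", "updated"],
                "record_count": len(self._records),
                "slo": {"freshness_hours": 24}}

if __name__ == "__main__":
    domain = AnimalDomain()
    domain.ingest([{"animal_id": "A-1", "weight_kg": 543, "updated": "2024-01-02"}])
    print(domain.serve(since="2024-01-01"), domain.metadata())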
Configuration
The domain should encapsulate the code responsible for executing and configuring
various structural and behavioural policies such as run-time environments, access con-
trol, compliance, encryption and privacy. Policies executed by the domain have two
major facets: 1) domain-driven policies, and 2) federated governance policies. Domain-
driven policies are the policies that the team generates based on the particularities of
its product. Federated governance policies are the high-level standard rules that come
from the federated computational governance layer. This layer is discussed in detail in
Section 6.4.5.
• Data are treated as products, not projects: Data are made discoverable to
other teams across the organisation via a centralised data catalogue or metadata
repository, termed Data Lichen in Metamycelium, as discussed in Section 6.4.9.
This simplifies access to data products without navigating complex silos or relying
on central IT.
• Data products are aligned with business capabilities: Data products are aligned
with business capabilities, rather than technical domains or systems. This means
that data product teams are responsible for understanding the needs of the business
and designing data products that support those needs.
This process can include the study of a data developer or data engineer’s journey
and how they interact with platform APIs. This relationship is depicted as an ‘x-as-
a-service’ facilitation relationship in Team Topologies presented by Skelton and Pais
(2019). Based on these premises the next architectural rule is as follows:
Architectural Rule 5 (AR5): Architecturally, the platform should not be overly
centralised or inflexible. This architecture should embrace the dynamics of today
and manifest itself as facilitating ‘as-a-service’ with a standard and easy-to-use
interface.
This platform architecture should be applied to two major areas: 1) the domains and
2) the data lichen. Data lichen is a word devised to name an architectural component
of Metamycelium. Data lichen is a major part of Metamycelium as it is a platform in
which data engineers can get access to data products available from the company. It
serves as a discovery mechanism. It supports various operations such as searching for
data sets, traversing the lineage, and providing metadata and data quality metrics. Data
lichen relies on interfaces provided by the domains as it collates and aggregates them to
a standard view. This construct is described in Section 6.4.9.
The platform plane is inspired by the concept of affordances by Norman (2013). In
his works, Norman defines affordance as a relationship between the properties of an
object and the capabilities of an agent. Therefore, the platform should afford experiences
such as easy-to-use and standard platform interfaces, and discovery and addressability
of data products.
Project (OWASP) (Open Web Application Security Project, 2017), or compliance issues
with regulations like the GDPR, California Consumer Privacy Act (CCPA), or Personal
Data Protection Act (PDPA). Additionally, the absence of data architects can result in
flawed data models and the selection of suboptimal tools for data management.
Hence, Metamycelium advocates for a governance model that ensures data across the
organisation is managed with an emphasis on quality, security, consistency, compliance,
performance, privacy, and usability. This approach necessitates a shift away from
traditional data engineering practices, which often rely on a central canonical data
model and comprehensive processes for data validation and certification, roles typically
filled by data stewards and custodians. However, such centralised models can impede the
agility of an organisation, slowing down the time to insight and constraining experiment-
driven development, which are essential for maintaining a competitive edge in a rapidly
evolving market (Baum, Münch & Ramler, 2016).
Metamycelium embraces constant changes in the data landscape and does not aim
to kill the creativity and innovation of the teams. In this architecture, a representative
from each domain joins the governance group, crystallising the concept of federated
governance. The next architectural principle follows:
among all domains. Proxies are like the dots in the general patterns of communication.
These proxies mediate and control communication. In addition, they collect and report
telemetry data.
The next logical component is the control plane. The control plane is responsible
for applying the standards and policies that come from the governance layer. This includes
the configuration codes that have been discussed in Section 6.4.2. The control plane
is like a control tower that looks over processes and enforces policies through special
high-level leverage.
The concept of a service mesh is to provide a dedicated infrastructure layer for
managing services in the domain in an automated way. This layer is responsible
for handling cross-cutting concerns so data engineers do not have to rewrite privacy,
security and runtime mechanisms. Furthermore, this layer serves as a mechanism for
automatically applying global policies that have been generated from the federated
governance layer.
The service mesh increases visibility and control over domain-to-domain communication,
makes it easier to understand and diagnose issues, and allows domains to operate
autonomously without having to create backlog tickets for platform engineers. In turn,
this increases the security, privacy, resiliency and scalability of Metamycelium. The
concept of service mesh, as depicted in Figure 6.7, originated with Istio, a joint project
by Google, IBM, and Lyft (Istio, 2018).
It is imperative for the policies to be standardised and consistent. This enables the
domain to avoid mismatching interfaces or getting into faulty states. Having a stan-
dard policy for access control, identity and encryption reduces complexity, increases
interoperability, reduces maintainability costs, and helps with scalability. Given that
the standard chosen is an open standard, the system can easily integrate with other
third-party systems outside the perimeter of the organisation, and this in turn increases
the interoperability of the architecture.
For instance, authentication and authorization are two key components of any
distributed system. Nodes need to communicate with each other safely through a secure
protocol and port. This issue is further highlighted when the domain has to serve its
data to external consumers. An effective system must be able to confidently identify
users or systems whether internal or external.
If there is no standard way of accessing and sharing data in Metamycelium, it would be impossible to enable data sharing among domains. The more diversified the approach to authentication and authorization, the higher the cost of maintenance and the greater the chance of friction. While this sounds intuitive and obvious, surprisingly, there is no industry-wide adopted protocol for identity and access management in data management systems. This has also hardly been discussed in the RAs reviewed in Chapter 4, Section 4.1.
Based on this, the next architectural principle is as follows:
Just like identity and access control, the system should be governed by an array of standards associated with encryption. These standards should cover data in use, in transit, and at rest. For instance, the organisation may opt to use confidential computing
techniques such as Trusted Execution Environments (TEEs) and secure enclaves (Sabt,
Achemlal & Bouabdallah, 2015; Arnautov et al., 2016).
Taken together, the next rule is as follows:
Architectural Rule 7 (AR7): There should ideally be one, and only one, standard way of addressing cross-cutting concerns.
AR7 aligns with the principle of having lower entropy in software architecture.
Lower entropy refers to a state of order, consistency, and predictability within the
system.
The heavy network demands of distributed systems may further complicate matters,
potentially causing tail latency, context switching, and gridlocks (Sriraman & Wenisch,
2018; Gan et al., 2019; Kakivaya et al., 2018; Ataei & Litchfield, 2021b). This tight
coupling is at odds with the distributed system’s goals of autonomy and resilience.
To overcome these issues, Metamycelium adopts asynchronous event-driven commu-
nication. This model enables services to publish and respond to events, thus decoupling
their interactions. In this publish and forget framework, services announce events
to specific topics and move forward without awaiting direct responses. This is simi-
lar to restaurant staff responding to environmental cues instead of direct commands,
promoting a smooth operational flow.
While event-driven architectures typically offer eventual consistency, which might
not be suitable for certain real-time stream processing scenarios requiring immediate
consistency, it is a safe assumption that the majority of data engineering workloads can efficiently operate within an event-driven paradigm. Based on this, the next principle is as follows:
Architectural Principle 9 (AP9): Services should avoid point-to-point communica-
tions if possible and instead opt for an asynchronous event-based and reactive style
of communication.
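The publish-and-forget style described above can be sketched as follows. This is a minimal illustration assuming a Kafka-compatible event backbone and the confluent-kafka Python client; the topic name and payload fields are illustrative rather than prescribed by Metamycelium:

import json
from confluent_kafka import Producer

producer = Producer({"bootstrap.servers": "localhost:9092"})

def announce(topic: str, event: dict) -> None:
    # Dispatch the event and move on without awaiting a direct response.
    producer.produce(topic, value=json.dumps(event).encode("utf-8"))
    producer.poll(0)  # serve delivery callbacks without blocking the caller

announce("customer-data-processed", {"domain": "customer", "dataset_uri": "s3://customer/2023-05-01/"})
producer.flush()  # drain pending messages on shutdown only; the normal flow never waits per event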
In Metamycelium, data is a first-class citizen. This implies that data should be treated
as a focal and essential component of the architecture. Data is increasingly dynamic,
comes in various forms, evolves rapidly and even changes semantics. Therefore, to address data in a maintainable way, some principles and architectural decisions should be taken into consideration.
Each domain needs to serve its data as a product, adhering to the integral principles
that data must be discoverable, trustworthy, immutable, bitemporal, and accessible with
read-only permissions. This grounds the next architectural rule for Metamycelium as
follows:
Architectural Rule 8 (AR8): All domains that serve data should make their data
discoverable through a standard interface.
Data discovery
Domain registration with Data Lichen is mandatory for participation in the data dis-
covery process, and this is typically integrated into the domain’s initialisation sequence.
This includes provisioning a globally accessible URI that references the domain’s
metadata, data, and operational status as described in Section 6.4.5.
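This registration step can be illustrated with a short sketch. The endpoint path and payload fields below are assumptions made for exposition; the architecture only requires that a domain register a globally accessible URI exposing its metadata, data, and operational status:

import requests

registration = {
    "domain": "weather",
    "uri": "https://weather.example.org/data-product",  # globally accessible URI for the data product
    "metadata_endpoint": "https://weather.example.org/data-product/metadata",
    "status_endpoint": "https://weather.example.org/data-product/status",
}

# Register the domain with Data Lichen as part of the domain's initialisation sequence.
response = requests.post("https://data-lichen.example.org/domains", json=registration, timeout=10)
response.raise_for_status()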
To render its data semantically transparent, a domain translates its semantics into
standardised abstractions, which are then accessible in various formats, a process eluci-
dated in Section 6.4.11. Data Lichen acts as a directory, providing initial metadata to
consumers, who can then directly interface with the domain’s URI for further data ac-
cess, incorporating privacy safeguards like differential privacy when necessary (Dwork,
2006).
For exploratory and interpretive purposes, domains may offer computational note-
books to facilitate an interactive engagement with data, as outlined by (Wolfram Re-
search, Inc., 2021). For automated interactions, data semantics and format choices are
communicated through the domain’s API, which, upon request, results in the data being
stored in a distributed storage system. The domain then informs the consumer of the
data’s location for subsequent retrieval.
This decentralised storage model offers several advantages, such as reduced event
backbone load and the efficient utilisation of distributed storage systems for handling
large datasets, thereby enabling a more flexible and maintainable architecture (Fig-
ure 6.9).
However, this approach is not without limitations. To address diverse processing
requirements, Metamycelium endorses a stream-first approach to inter-domain commu-
nication, highlighting streams’ attributes of real-time processing, low latency, scalability,
and immutability, which are paramount for contemporary data-intensive operations, as
characterised by Urquhart (2020) in his discussions on streaming architectures.
These features render streams a pivotal element within Metamycelium, enabling not only real-time data dissemination but also ensuring data persistence for historical analysis.
Architectural Principle 10 (AP10): Favour streams as the main data structure for
inter-domain communication.
Architectural Rule 10 (AR10): All domains must comply with a prescribed set of
data quality benchmarks.
• Validity: The extent to which data is congruent with established standards and
criteria, affirming its intended utility.
• Last processing time: The point in time of the latest successful data processing run.
• Timeliness: The degree of data recency and pertinence relative to its intended
application.
Finally, domains may elect to proffer custom metrics tailored to the data
consumer, such as the latest data access timestamp, deprecated fields, and datasets
currently under revision. The aforementioned metrics draw inspiration from two laud-
able W3C open-source standards devised for the standardisation of consistent data
metrics: 1) Data on the Web Best Practices: Data Quality Vocabulary (World Wide
Web Consortium, 2017) and 2) Data Catalog Vocabulary (DCAT) - Version 2 (World
Wide Web Consortium, 2014). These standards are endorsed and promulgated within
Metamycelium to catalyse a homogenised lexicon and best practices, fostering an open
data ecosystem on the web.
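As a hedged sketch, a domain might expose the metrics above through its metadata API as follows; the field names loosely follow the DQV and DCAT vocabularies but are not mandated by the RA, and the values are placeholders:

from datetime import datetime, timezone
from fastapi import FastAPI

app = FastAPI()

@app.get("/data-product/quality")
def quality_metrics() -> dict:
    # Quality metrics served alongside the domain's other metadata.
    return {
        "validity": 0.98,  # share of records congruent with the agreed standards
        "last_processing_time": datetime(2023, 5, 1, 6, 0, tzinfo=timezone.utc).isoformat(),
        "timeliness_hours": 24,  # maximum acceptable staleness for the intended application
        "custom": {"deprecated_fields": ["fax_number"], "datasets_under_revision": []},
    }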
Beyond the principles of data provision discussed in Section 6.4.8, the domain is
obliged to construct a coherent representation of its data, which involves conveying
the intrinsic semantics of the business elements. The choice of representation, be it an
entity-relationship model, a columnar model, or a document model, should align with
the domain’s business logic intricacies. Based on this, the next principle is as follows:
Analytical and machine learning workloads necessitate the interconnection and consoli-
dation of data across domains. The prevalent BD architecture frameworks, as elucidated
in Chapter 4, typically employ explicit relationships for data composability. Typically,
this involves extracting data from transactional systems or data lakes, followed by
cleansing and modelling using dimensional schemes, such as star or snowflake schemas
(Kimball & Ross, 2013). However, such centralised methods of data composability
prove suboptimal for decentralised architectures, as they can become fragile and make
schema evolution onerous.
In contrast, the Apollo GraphQL framework (Apollo GraphQL, 2023) facilitates data
composition through a distributed type system, utilising sub-graphs and super-graphs,
which are independent services with their own schemas and resolvers that can interlink
and extend across the network. Alternatively, data composability can also be achieved
using hyperlinks and centralised type systems like Schema.org (schema.org, 2011),
popular within the semantic web community, deploying linked data to interconnect
related datasets.
Metamycelium draws inspiration from these models, particularly adopting the
concept of a distributed type system, moving away from conventional master dataset
management. Domains within Metamycelium are self-contained, each defining their
schemas and lifecycle, which are addressable through a unique URI as part of the
metadata API, ensuring schemas are always accessible and current.
With well-defined domain boundaries, this system permits schemas such as those
for animals within a veterinary application to be uniquely addressable. These URIs
enable various domains to reference and extend schemas for their specific requirements.
Furthermore, schema metadata should include temporal aspects like ‘last processing
time’ and annotations for deprecated fields. Such a system is illustrated in Figure 6.10.
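An illustrative sketch of such schema metadata, addressable through a unique URI, might look as follows; the structure is an assumption made for exposition, and only the presence of a URI, temporal attributes, and deprecation annotations follows from the text above:

animal_schema_metadata = {
    "schema_uri": "https://vet.example.org/schemas/animal/v3",  # uniquely addressable schema
    "version": "3.2.0",
    "last_processing_time": "2023-05-02T00:00:00Z",
    "deprecated_fields": ["legacy_owner_id"],
    "extends": ["https://vet.example.org/schemas/creature/v1"],  # reference to another domain's schema
}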
Immutability means that data, once created, does not change. This concept is fundamental
to functional programming and is widely discussed in academic literature (Haller &
programming error. On the other hand, if there was a bug in the analysis query code, it
becomes difficult to track down what went wrong at that point in time.
This issue is exacerbated in a distributed BD architecture like Metamycelium. If upstream data is used by several downstream consumers, and these consumers in turn provide data to further downstream consumers, each domain keeping a slice of that data, then the most downstream service may end up receiving two versions of the same data, resulting in mismatched truths.
Backtracking this data upstream is feasible but undesirable, as it would be a tedious and time-consuming task. Therefore, immutability is integral to Metamycelium, providing two important guarantees: 1) once data is processed at a point in time, it will never change and consumers can reliably repeat data reads; 2) data that has been read is kept consistent across different downstream consumers. Based on this, the next rule is as follows:
Architectural Rule 11 (AR11): Analytics data created from the domains for a given point in time should be immutable. That is, the data that has been produced for a given point in time should never change.
The immutability of data reduces side effects, diminishes chances of incidental com-
plexity, makes debugging and issue finding easier, and increases the overall reliability
of the system. This concept is inspired by the principles of functional programming
(Bird & Wadler, 1988).
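A minimal sketch of AR11 under an append-only convention is shown below: analytics data for a given point in time is written once to a time-addressed location and never overwritten. The paths and local file-based storage are illustrative assumptions standing in for the distributed storage service:

import json
from pathlib import Path

def write_point_in_time(dataset: str, processing_time: str, records: list) -> Path:
    # Data for a given processing time is addressed by that time and written exactly once.
    target = Path("analytics") / dataset / processing_time / "part-0000.json"
    if target.exists():
        # Corrections are published under a new processing time rather than mutating history.
        raise FileExistsError(f"{target} is immutable once written")
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_text(json.dumps(records))
    return target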
Bitemporality
Bitemporal data associates two dates with each record: the actual date, indicating the actual occurrence of an event or fact, and the effective date, denoting the time when the event or fact is considered true.
Bitemporal data modelling enhances the capacity for temporal analysis and prognos-
tication by enabling meticulous examination of historical data patterns and facilitating
predictions of forthcoming phenomena. Variability in update frequencies among distinct
domains may engender inconsistencies, thereby affecting the integrity of predictive mod-
els. Such variability can be effectively neutralised by instituting a uniform processing
time across data domains.
To illustrate, a data scientist might standardise the data selection process by extract-
ing records across all domains that have processing times within the delineated interval
from 2023-05-01 to 2023-05-02. Crucially, all domains must embed two temporal
properties into their datasets: actual time and processing time, as mandated by the
following architectural rule:
Architectural Rule 12 (AR12): All domains should associate two properties to their
data: 1) actual time, and 2) processing time.
Additionally, data processing should be scheduled at intervals that are suitable for
the context, allowing for data correction, quality enhancement, and minimisation of
data retractions. For instance, a financial application that processes a high volume of
transactions would necessitate a processing interval that is appropriately timed to match
the production and aggregation rate of the transactional data.
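A small sketch of AR12 follows: every record carries an actual time and a processing time, and a consumer selects records by a uniform processing-time interval, here 2023-05-01 to 2023-05-02 as in the example above. The record fields and values are illustrative:

from datetime import date, datetime

records = [
    {"actual_time": datetime(2023, 4, 30, 23, 50), "processing_time": date(2023, 5, 1), "value": 42},
    {"actual_time": datetime(2023, 5, 1, 10, 15), "processing_time": date(2023, 5, 2), "value": 17},
    {"actual_time": datetime(2023, 5, 2, 8, 5), "processing_time": date(2023, 5, 3), "value": 99},
]

# A uniform processing-time cut applied identically across all domains.
start, end = date(2023, 5, 1), date(2023, 5, 2)
selected = [r for r in records if start <= r["processing_time"] <= end]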
The architectural characteristics of Metamycelium are notable for their focus on maintainability, scalability, fault tolerance, elasticity, and deployability. The architecture aligns with modern engineering practices such as automated deployment and continuous integration, and it emphasises the use of microservices, which are independently deployable and maintainable components.
Maintainability is a high-scoring characteristic of this architecture. The use of event-
driven microservices architecture allows for modular development and independent
scaling, making it easier to maintain and update individual components without affect-
ing the entire system. Additionally, the architecture supports automated deployment
practices, facilitating efficient updates and reducing manual intervention.
Scalability and elasticity are also prominent features. The architecture enables
horizontal scalability, allowing for the addition or removal of services based on demand.
This flexibility ensures that the system can handle varying workloads effectively.
While the rating may not be entirely objective and depends on the researcher’s
interpretation, it is guided by well-established industry patterns and best practices. For
instance, event-driven architectures are commonly highlighted in "Patterns of Enterprise
Application Architecture" by Fowler (2012) as promoting high maintainability due
to loose coupling and asynchronous communication. Similarly, "Software Architec-
ture: The Hard Parts" by Ford, Richards, Sadalage and Dehghani (2021) discusses how microservices architectures are often rated favourably for scalability due to their modularity and ability to scale individual services independently. The book "Solutions
Architect’s Handbook" by Shrivastava, Srivastav, Sheth, Karmarkar and Arora (2022)
provided insights on evaluating maintainability based on factors like modularity of the
architectural constructs.
The evaluation of each characteristic was performed by thoroughly analysing the
Metamycelium architecture, its principles, rules, and components, as described in this
chapter. The rating for each characteristic was then determined by carefully considering
the guidelines, criteria, and examples provided in the aforementioned books, which
offer industry-recognised frameworks for evaluating software architecture qualities.
Any characteristics that aligned well with the recommended practices and exhibited
strong support in the Metamycelium architecture were assigned higher ratings, while
those with potential weaknesses or limited support received lower ratings.
Characteristic Score
Maintainability ☀☀☀
Scalability ☀☀☀☀
Fault Tolerance ☀☀☀
Elasticity ☀☀☀☀
Deployability ☀☀☀☀
Cost ☀☀
Simplicity ☀
Performance ☀☀☀
Support for Modern Engineering Practices ☀☀☀☀
6.5 Artefact
After having discussed many kernel and design theories, the necessary theoretical
foundation is created for the design and development of the artefact. Metamycelium is
created with ArchiMate and displays the RA mostly in the technology layer. Displaying
these services in the technology layer means that it is up to the architect to decide what
flow and application should exist in each node. For the sake of completeness, and as every software system is designed to serve a business need, a very simple BD business process is assumed. While this business layer could vary in different contexts, Metamycelium should have the elasticity required to account for various business models.
It should be noted that the BD RA does not represent the architecture of any spe-
cific BD system. Instead, it serves as a versatile tool for describing, discussing, and
developing system-specific architectures using standardised principles. By offering
comprehensive and high-level perspectives, Metamycelium facilitates productive dis-
cussions regarding the requirements, structures, and operations inherent in BD systems.
Notably, it remains vendor-neutral, allowing flexibility in selecting products or services,
and does not impose rigid solutions that limit innovation.
6.5.1 Metamycelium
Beyond load balancing, the ingress gateway offers several architectural advan-
tages. Firstly, it enhances security by preventing port proliferation and ensuring
that services are not directly accessed. Acting as a central entry point, it en-
ables fine-grained control and enforcement of security policies, such as SSL
termination, authentication, and potentially name-based virtual hosting. This con-
solidation of security measures at the ingress gateway safeguards the system from external threats.
Additionally, the ingress gateway brings benefits in terms of monitoring and ob-
servability. With a clear point of entry, monitoring the incoming requests becomes
easier, and metrics, logging, and tracing can be focused on this central component.
This heightened visibility enables efficient troubleshooting, performance analysis,
and compliance monitoring.
processing controllers.
Having a specific controller for batch processing acknowledges the unique re-
quirements and characteristics of batch events, providing dedicated functionality
and optimisation. This component addresses the requirements Vel-1, Val-1, and
Val-2.
The stream processing controller, although also a small service, focuses on lightweight computations that are optimised for stream processing requirements. It
can enable stream provenance, which tracks the lineage and history of streaming
events, providing valuable insights for data governance and traceability. Addition-
ally, the stream processing controller can leverage one-pass algorithms, which process each element of the stream only once.
The stream processing controller also simplifies monitoring and discovery within
the architecture. By having a separate component dedicated to stream processing,
it becomes easier to track and analyse the performance, latency, and throughput of
streaming events. Additionally, it enables focused monitoring of stream-specific
metrics, providing valuable insights into the behaviour and efficiency of the
streaming data pipeline.
its event-handling module. This can easily turn into a spaghetti of incompatible
implementations by various teams, and can even cause bugs and unexpected
behaviours.
Event brokers can also account for more dynamism by learning which events
should be routed to which consumer applications. Moreover, event brokers also
implement circuit breaking: if a target service is unavailable and does not respond
for a certain amount of time, the broker signals the unavailability of that service
to the rest of the services, so no further requests come through. This is essential
for preventing a ripple effect over the whole system if one service fails. This
component is driven by the principles of
Competing Consumers and Circuit Breaker patterns discussed in Section 5.3.4
and 5.3.4 and indirectly addresses the requirements Val-1, and Ver-1.
Here, each service (dancer) listens and reacts to the event backbone (music)
and takes the required action. This means services are only responsible for
dispatching events in a dispatch and forget model, and subscribe to the topics that
are necessary to achieve their ends. The event backbone thus ensures a continuous flow of data among services so that all systems are in the correct state at all times. The event backbone can be used to mix several streams of events, cache events, archive events, and perform other manipulations of events, so long as it does not become too smart or turn into an ESB in the style of SOA architectures.
domain’s analytical service, a control tower (e.g., Istio), and integration with a
federated governance service’s API for policy enforcement through sidecars.
The effectiveness of the service mesh stems from its architectural design and its
ability to address critical requirements. By encapsulating the domain’s capabili-
ties within a service mesh, the coupling between teams is eliminated, allowing
for enhanced team autonomy. This architectural approach empowers individuals
across teams by granting them the computational resources, tools, and autonomy
necessary to operate independently and scale without being negatively affected
by other teams or encountering friction with platform teams or siloed data engi-
neering teams.
using its API to retrieve and enforce policies through sidecars. This ensures
adherence to governance and compliance requirements, supporting centralised
control and management across the service mesh.
By enabling direct access to operational data within the service mesh, the analyti-
cal service gains real-time insights and a holistic view of the system’s operations.
This seamless integration of operational data eliminates the need for manual data
extraction and transformation processes, reducing latency and enabling timely
decision-making.
Moreover, having native access to operational data enhances the analytical ser-
vice’s effectiveness and accuracy. It eliminates potential data discrepancies or
inconsistencies that may arise when analytics are performed on ETLed data. By
accessing the operational data directly, the analytical service can provide up-to-
date and reliable insights, contributing to more accurate analysis and informed
decision-making.
This native and accessible access to operational data within the service mesh
brings analytics closer to the source. It promotes a data-driven culture by em-
powering the analytical service to work in tandem with operational systems,
enabling a feedback loop where insights from analytics can drive optimisations
and improvements in real time. This closer integration fosters a more agile and re-
sponsive approach to decision-making, leading to enhanced operational efficiency,
innovation, and business value. This also eliminates accidental data quality issues.
By eliminating the divide between analytics and operational data, the service
mesh architecture supports a unified view of the system, facilitating seamless
collaboration between analytics and operations teams. This convergence promotes
a holistic understanding of the business and operational dynamics, enabling data-
driven insights to be readily incorporated into the operational workflows.
The service mesh’s effectiveness lies in its ability to address key architectural
concerns. It promotes scalability, allowing the domain to handle large volumes
of data and increasing computational resources as needed (Vol-1). It facilitates
rapid development and deployment of analytical capabilities (Vel-3, Vel-4, Vel-5).
The service mesh architecture accommodates variability in business contexts,
supporting the diverse needs and requirements of different product domains (Var-
1, Var-2, Var-3). It ensures data validation, quality, and integrity by leveraging
advanced analytics and processing techniques (Val-1, Val-2, Val-3, Val-4). Se-
curity and privacy requirements are fulfilled through policy enforcement, secure
communication, and data governance mechanisms (SaP-1, SaP-2). Finally, the
service mesh architecture allows for the verification of system behaviour, enabling
efficient testing, monitoring, and verification of the domain’s analytical outputs
(Ver-1, Ver-2, Ver-3). This component design is affected by the patterns CQRS
(Section 5.3.4), Anti-Corruption Layer (Section 5.3.4), and Pipes and Filters
(Section 5.3.4).
standardise these services. This will facilitate interoperability between services, communication, and aggregates, and even allow for a smoother exchange of members across teams. It also means the most experienced people at a company, such as technical leads and lead architects, will prevent potential pitfalls that more novice engineers may fall into. However, the aim of this service is not
to centralise control in any way, as that would be going a step backwards into the
data warehouse era.
This service aims to allow an autonomous flow of the standards and policies that protect the company from external harm. For instance, failing to comply with GDPR while operating in Europe can incur fines of up to 10 million euros, and this may not be something that novice data engineers or application developers are fully aware of (Voigt & Von dem Bussche, 2017).
The real challenge of the governance team is then to figure out the necessary
abstraction of the standards to the governance layer and the level of autonomy
given to the teams. The federated governance service is made up of various
components such as global policies, metadata elements and formats, standards
and security regulations. These components are briefly discussed below:
(a) Global Policies: general policy that governs the organisational practice.
This could be influenced by internal and external factors. For instance,
complying with GDPR could be a company’s policy and should be governed
through the federated governance service.
(c) Standards: overall standards for APIs (for instance Open API), versioning
(for instance SemVer), interpolation, documentation (for instance Swagger),
data formats, languages supported, tools supported, technologies that are
accepted and others.
9. Data Lichen: As the number of products increases, more data becomes available to be served to consumers, interoperability increases, and maintenance becomes more challenging. If there is then no automatic way for various teams to access the data they desire, a rather coupled and slow BD culture will evolve. To avoid these challenges and to increase discoverability, collaboration, and guided navigation, Data Lichen should be implemented. Data discovery mechanisms like Data Lichen are listed as a must-have by Gartner (Ehtisham Zaidi, 2019) and introduce better communication dynamics, easier data serving by services and intelligent collaboration between services. This component addresses the
requirements Vel-4, Var-1, Var-3, and Var-4. This component is portrayed in
Figure 6.13.
10. Telemetry Processor: If all services employ the idea of localised logging, and
simply generate and store logs in their own respective environments, debugging,
issue finding and maintenance can become a challenging task. This is due to the fragmentation of logs across many isolated environments.
Moreover, the centralised service provides a unified view of logs, metrics, and
traces, enabling comprehensive analysis and correlation of data across the entire
system. This allows for holistic monitoring, troubleshooting, and performance analysis.
11. Event Archive: As the number of services grows, the topics in the event backbone
increase, and the number of events surges. Along the way, there could be a failure, resulting in a timeout and the loss of a series of events. This brings the system into an incorrect state and can have a detrimental ripple effect on all services. Metamycelium handles these failures by using an event archive. The event archive, as the name states, is responsible for registering events so that they can be retrieved at the time of failure.
to have its own kind of data storage. This prevents duplication, contrasting data storage approaches, decreased interoperability among services, and the lack of a unified data storage mechanism. The distributed storage service has been designed to
store large volumes of data in raw format before it can get accessed for analytics
and other purposes.
This means data can be first stored in the distributed storage service with cor-
responding domain ownership before it needs to be accessed and consumed by
various services. Structured, semi-structured, unstructured and pseudo-structured
data can be stored in the distributed storage service before it gets retrieved for
batch and stream processing. Nevertheless, this does not imply that all data should go directly to this service; the flow of data is determined based on the particularities of the context in which the system is embodied. This component
addresses the requirements Vol-2, Vel-1, Var-1, Var-3, Var-4, Val-3.
One of the key architectural values of this PaaS component is its ability to abstract
the underlying infrastructure complexities. By providing a standardised API,
it allows each component to independently manage and provision the required
resources, such as compute, storage, and networking, without being burdened
by the intricate details of the underlying infrastructure. This abstraction layer
promotes loose coupling between components and facilitates easier development,
deployment, and maintenance of the system as a whole.
Each data domain can make requests to the PaaS API to provision, configure,
and manage the necessary resources, enabling them to operate independently and
efficiently. This decentralisation of infrastructure management enhances agility
and flexibility within the architecture. This component addresses SaP-1, SaP-2,
Var-1, Var-3, Var-4, Vel-1, and Vol-2.
14. Identity and Access Management: The role of the Identity and Access Management (IAM) component is to ensure secure and controlled access to the system’s resources and data. It encompasses various architectural values that are essential for maintaining data integrity, privacy, and regulatory compliance.
One of the key architectural values of the IAM component is its focus on authen-
tication and authorization. It provides robust mechanisms to authenticate users,
components, and services within the architecture, ensuring that only authorised
entities can access the resources and perform specific actions. These mechanisms
help prevent unauthorised access, mitigate security risks, and safeguard sensitive
data.
The IAM component ensures detailed access control and privilege management,
allowing for the establishment of specific access policies. It supports robust
authentication via standard protocols like OAuth and SAML, streamlining user
access with SSO. Additionally, it underpins auditing by logging access events,
thereby bolstering compliance, security oversight, and incident management.
This component addresses the requirements SaP-1, and SaP-2. A comprehensive
depiction of the component’s functionality and its interaction with other system
parts is illustrated in Figure 6.15.
15. Secret Management System: The central secret management system serves as
an important component for securely storing and managing sensitive information
such as passwords, API keys, cryptographic keys, and other secrets.
One of the key architectural values of the central secret management system
is its focus on the secure storage and encryption of secrets. It employs robust
encryption algorithms and mechanisms to protect sensitive data at rest, ensuring
that secrets are securely stored and inaccessible to unauthorised entities. This
value helps prevent unauthorised access to secrets, mitigates the risk of data
breaches, and ensures confidentiality.
Additionally, the secret management system supports the secure distribution and
retrieval of secrets to authorised components or services. It provides mechanisms
such as secure APIs or client libraries that enable secure retrieval of secrets during
runtime. This value ensures that sensitive information is only accessible to the authorised entities that require it, preventing unauthorised exposure of secrets.
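As a minimal sketch, a service could retrieve a secret at runtime from such a system. The example assumes HashiCorp Vault, which is used in the prototype in Chapter 7, and its hvac Python client; the mount path and key names are illustrative:

import os
import hvac

# Authenticate against the central secret store; address and token come from the environment.
client = hvac.Client(url=os.environ["VAULT_ADDR"], token=os.environ["VAULT_TOKEN"])

# Read a secret from the KV version 2 engine; only authorised identities can read this path.
secret = client.secrets.kv.v2.read_secret_version(path="customer-domain/db")
db_password = secret["data"]["data"]["password"]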
The variable elements in Metamycelium can be adjusted, modified and even omitted
based on the architect’s decision and the particularities of the context. The aim of
this RA is not to limit the creativity of data architects but to facilitate their decision-
making process, through the introduction of well-known patterns and best practices
from different schools of thought. All alternative options for each variable module
are not elaborated as the industry constantly changes, and architects constantly aim to
design systems that address the emerging problem domains.
For instance, an architect may choose to omit IAM from the implementation because the company is not yet ready to invest in such a system. The architect may choose to
implement authentication per service as the company has not yet scaled to be fully
domain-driven.
6.6 Conclusion
The chapter defines the requirements of Metamycelium based on big data characteristics
such as volume, velocity, variety, value, security, privacy, and veracity. It also discusses
the underlying theories of Metamycelium, including the limitations of monolithic
architectures, the need for domain-driven decentralisation, treating data as a first-class
citizen, and the importance of federated computational governance. Additionally, the
chapter presents architectural styles, principles, and rules that guide the design of
Metamycelium, such as adopting microservices architecture, promoting local autonomy,
and adhering to data quality metrics. Finally, the artefact, Metamycelium, is delineated
with its 15 main components and 5 variable components, each serving specific roles in
the overall architecture.
The next chapter will focus on evaluating the Metamycelium artefact through a case
mechanism experiment. This evaluation will assess how well the artefact addresses the
defined requirements based on big data characteristics.
Chapter 7
Evaluation - Case Mechanism Experiment
7.1 Introduction
After presenting the design of the artefact in the previous chapter, the thesis now
transitions into the evaluation phase, which is essential for assessing the artefact’s
effectiveness and validity. It should be recognised that treatment validation plays a critical role in the research, development, and implementation process.
The technologies chosen for each component of Metamycelium are based on the tech-
nology selection research conducted in Section 2.9.5 of the Chapter 2. The evaluation
matrix presented in Table 7.1 summarises the results of the technology selection process.
Table 7.1: Evaluation Matrix
Based on the result of the evaluation matrix, technologies that provide the func-
tionalities needed for Metamycelium components are chosen. The preference is for
open-source technologies. The chosen technologies for each component are displayed
in Table 7.2.
Prototyping involves two distinct aspects: first, the development of a concrete architec-
ture that describes how the reference architecture can be applied to a specific context,
and second, the instantiation of this concrete architecture into an implemented system.
The concrete architecture serves as a blueprint that can support multiple system variants,
versions, and deployments. Its effectiveness is measured by how well it guides the
development, deployment, and operation of these various instantiations over time. This
is often called the concrete architecture (Wieringa, 2014). This process allows for
the validation and refinement of the architecture, enabling researchers to assess its
feasibility, performance, and alignment with the research objectives.
Prototyping the architecture involves implementing the key components and func-
tionalities outlined in the RA, Metamycelium. The chosen technologies, which are
academically justified and aligned with the research objectives, form the foundation
for building the prototype. Through the systematic implementation of the architectural
components, the architecture’s effectiveness in handling large-scale data processing,
addressing volume, velocity, variety, value, security, privacy, and veracity requirements
can be assessed.
This prototyping consists of three major phases: formation, amalgamation, and
scenario testing. The formation phase focuses on creating the foundational services
and components of the architecture using the chosen technologies. Next, the amalga-
mation phase integrates these services and components, establishing connections and
communication channels to ensure they work together as a cohesive system. Finally, the
scenario testing phase puts the architecture’s capabilities to the test by executing a set
of scenarios that simulate real-world use cases and edge cases, covering various aspects
such as data ingestion, processing, query execution, and security. The results of these
scenarios are then analysed and presented to evaluate the architecture’s performance,
scalability, and alignment with the research objectives. These phases are portrayed in
Figure 7.1.
All of the components of this prototype are embodied in one Kubernetes cluster. Thus,
for this prototype to work, KIND (Kubernetes Special Interest Group, 2023) has been
chosen as the local Kubernetes cluster.
Of particular importance, is the repeatability of this process. For this purpose, all
the scripts necessary to run, download, and configure artefacts are stored in the scripts
folder in the Github repo at Polyhistor (2023b). It is important to note that these scripts
are written specifically to run on Unix-like operating systems.
After having set up the basic scripts for bringing up the cluster, the process began by
creating different services inside the Kubernetes cluster. For Nginx ingress and Kafka,
Helm Charts on Artifact Hub (Team, 2023) are used for increased maintainability and scalability. Terraform was then used to apply the charts to the Kubernetes cluster. This can be found in the IaaC folder of the repository.
After conducting a thorough search for Helm charts for all components of the artefact, a mature and well-regarded chart from Bitnami was chosen. This approach is
preferred over creating special directives for Nginx ingress in Kubernetes and map-
ping services to each other using Kubernetes architectural constructs like Services or
StatefulSets.
After the ingress, the search continued on Artifact Hub for charts for the other components of the concrete architecture. After having found suitable charts for Nginx ingress, Keycloak, and Vault, the telemetry processor is created.
After that, the FastAPI application is packaged into a Docker image. This image contains everything the application needs to run, including its code and any necessary libraries. This image is then served via Github pages and used in Terraform Helm release resources to be applied to the local KIND cluster. From there on, the Kubernetes ingress controller is set to Nginx and applied to the local KIND cluster using Terraform.
After having the ingress controller set up, the ingress resources are made to point traffic to the Kafka REST proxy. The Kafka REST proxy is added to the artefact to ease the process of accessing data from the cluster, because Kafka uses a binary protocol as opposed to HTTP, while most agents today rely on HTTP for network communication.
With the Nginx ingress and the Kafka REST proxy correctly configured, Keycloak is installed and integrated into different elements of the architecture. From there on, Vault is added to the cluster using HashiCorp’s official Helm chart (HashiCorp, 2023b). The Vault UI is activated, albeit with challenges discussed in Section 7.4.
After this, the next big system to implement is Data Lichen. Since Data Lichen needed a frontend built from scratch, Embedded JavaScript (EJS), a template engine for Node.js, is chosen. This template engine was chosen as it eases development and accelerates the processes of this experiment.
The objective is not to develop a comprehensive frontend for Data Lichen, but rather
to construct a straightforward user interface that effectively demonstrates the potential
appearance and functionality of this architectural framework. Different teams may
choose to implement Data Lichen in different ways. Data Lichen needs to be coded from scratch as there is no open-source technology that provides the functionalities required.
After having Data Lichen sorted and deployed to the cluster, Istio service meshes are created. For this purpose, different Kubernetes namespaces are created, and different services are grouped under the same namespace for the service mesh to govern and operate.
Additionally, for a better development experience and observability, Istio’s dashboard,
Kiali, is installed.
Next, the Minio chart is installed, and the dashboard is activated. From there on, the final piece that had to be installed was Open Policy Agent (OPA). There are several approaches to integrating OPA into an architecture: through a standalone Envoy proxy, a Kubernetes operator, a load balancer such as Gloo Edge, or through Istio.
Istio is chosen as it is already selected and deployed to the cluster. However, the
process of automatically injecting policies into services running in a specific service
mesh is not a trivial task. An approach taken for this experiment is the extension of
the Envoy sidecar with an external authoriser API to allow for inbound and outbound
policies.
This requires a deep knowledge of Envoy, Kubernetes, Istio, Rego (the policy language used by OPA), and Go. OPA is integrated into the service mesh as a separate service (also known as a sidecar proxy). OPA also acts as an admission controller, reviewing requests made to the Kubernetes API server.
OPA evaluates these requests against set policy rules. If a request is approved, it is
processed normally by the API server; if denied, an error response is returned to the
user. In terms of services and pods in a namespace, all interactions are also regulated by
OPA’s policies. In essence, OPA helps maintain compliance with predefined security
and operational rules in a Kubernetes environment. This is illustrated in Figure 7.3.
Figure 7.3: Open Policy Agent Communication Flow with Other Services
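How a policy decision is obtained from OPA can be sketched as follows. In the prototype the decision is enforced by the Envoy sidecar rather than by application code, but the request and response shape against OPA's data API is the same; the package path and input document below are assumptions:

import requests

# Ask OPA whether a cross-domain read should be allowed.
decision = requests.post(
    "http://opa.opa-system.svc:8181/v1/data/metamycelium/authz",
    json={"input": {"method": "GET", "path": "/datasets/weather", "subject": "customer-domain"}},
    timeout=5,
).json()

allowed = decision.get("result", {}).get("allow", False)  # deny by default when no result is returned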
The first stage of the prototype coding and configuration concerns the creation of the foundational services in the cluster. The subsequent stages concern connecting them and making sure that the architecture responds to external stimuli in an effective way. To do this, a series of scenarios is chosen to be applied against the system.
The use of scenarios in the evaluation of new RAs has become a ubiquitous practice,
primarily because they provide a powerful mechanism to simulate real-world use-cases,
enabling the evaluation of the architecture’s performance under various conditions
(Kazman et al., 1998b, 1994; Len Bass, 2021).
This is particularly critical for this experiment, as these scenarios emulate different
behaviours, system loads, data volumes, and types of processing, providing insights
into the architecture’s resiliency, flexibility, scalability, and interoperability (Len Bass,
2021).
Inspired by the templates discussed in the works of Carroll (1995) and by the
examples discussed by Kazman et al. (1998a), a series of scenarios to test the prototype’s
capabilities are defined. These scenarios, each addressing a different aspect of system
performance and security, are detailed in Tables 7.3 through 7.8.
In this phase, following the definition of scenarios, the system is initialised and each
scenario is actively executed against it. For every specific scenario, the flow and its
application to the system are discussed, focusing on how the system manages each set
of conditions. This approach involves bringing up the system and then systematically
running each scenario against it.
By doing so, key metrics are captured, and a deeper understanding of the system’s response mechanisms to various stimuli is obtained. This method tests the system’s resilience and performance and offers insights into its operational dynamics and potential areas for enhancement.
regards to each domain. Data Lichen renders this information in a custom-coded table, as per Figure 7.4. The use of a unique identifier for each service ensures that the listing and display of services are consistent and reliable, providing the same results with each operation. This flow is communicated in Figure 7.5.
This scenario follows the flow of domain A requesting a 5GB JSON file from domain B. Domain A owns the weather data, while domain B owns the customer ratings and business data. For this purpose, an event is first dispatched to Data Lichen from domain A to get a list of all available datasets with metadata including actual time, processing time, completeness and accuracy.
This metadata includes the address of the data domain to be retrieved. This address is then used to dispatch an event to domain B to retrieve the data. This triggers the internal mechanism of domain B, which results in the creation and storage of the datasets in the distributed storage service. Once this process is complete, an event is
dispatched to the data processing completion topic, which domain A is subscribed to.
An overview of Kafka topics used in this experiment is portrayed in Figure 7.6.
The event that communicates the completion of the data processing includes the
address of the data as well. Domain A uses this address to fetch the data. Once data is
fetched, then it is processed. This flow is depicted in Figure 7.7.
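The request leg of this scenario can be sketched as below: domain A dispatches a dataset-request event to a topic owned by domain B through the Kafka REST proxy exposed via the ingress. The host, topic names, and payload fields are illustrative assumptions rather than part of the prototype's actual configuration:

import requests

event = {
    "records": [{
        "value": {
            "requesting_domain": "weather",           # domain A
            "dataset": "customer-ratings",            # owned by domain B
            "format": "json",
            "reply_topic": "data-processing-completion",  # the topic domain A is subscribed to
        }
    }]
}

# Produce the event through the REST proxy, since Kafka itself speaks a binary protocol.
response = requests.post(
    "http://localhost/kafka-rest-proxy/topics/customer-domain-requests",
    json=event,
    headers={"Content-Type": "application/vnd.kafka.json.v2+json"},
    timeout=10,
)
response.raise_for_status()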
For this scenario, understanding system performance and resource utilisation is
pivotal. Data Ingestion Rate provides a direct measure of the system’s throughput,
indicating how efficiently the system can absorb and manage vast amounts of incoming
data. A high rate signifies optimal data handling capabilities, essential for scenarios
where rapid data inflow is expected. Latency, broken down into ingestion and processing
components, offers insights into system responsiveness. Low latency ensures timely
data availability and processing, which is crucial for systems where delays can have
downstream impacts.
Moreover, CPU utilisation and memory utilisation are fundamental indicators of
the system’s resource efficiency. High CPU or memory consumption may indicate
bottlenecks, inefficiencies, or potential areas of optimisation, especially during peak
loads. Lastly, the error rate offers insights into the system’s robustness and reliability. A
low error rate, even under high data volumes, suggests that the system can consistently
handle and process data without faltering.
Together, these metrics holistically assess the system’s capability to maintain perfor-
mance and stability under intense data ingestion demands. In the following sections
each metric is described in relation to the mechanism that has been triggered in the
system.
Data Ingestion Rate: To measure the data ingestion rate, each service is instrumented
using OpenTelemetry SDKs to automatically capture telemetry data for FastAPI requests
and manual spans for specific operations, with the collected data being exported to
Kafka via the KafkaRESTProxyExporter class.
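A simplified sketch of this instrumentation approach is given below: FastAPI requests are traced with OpenTelemetry and spans are shipped to a Kafka topic through the REST proxy. This only approximates the thesis's custom KafkaRESTProxyExporter; the class name, proxy URL, and topic name here are assumptions:

import json
import requests
from fastapi import FastAPI
from opentelemetry import trace
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import BatchSpanProcessor, SpanExporter, SpanExportResult
from opentelemetry.instrumentation.fastapi import FastAPIInstrumentor

class KafkaRestProxySpanExporter(SpanExporter):
    # Ships finished spans to a Kafka topic via the REST proxy.
    def __init__(self, proxy_url: str, topic: str = "open-telemetry"):
        self.endpoint = f"{proxy_url}/topics/{topic}"

    def export(self, spans) -> SpanExportResult:
        records = {"records": [{"value": json.loads(span.to_json())} for span in spans]}
        requests.post(self.endpoint, json=records,
                      headers={"Content-Type": "application/vnd.kafka.json.v2+json"}, timeout=5)
        return SpanExportResult.SUCCESS

provider = TracerProvider()
provider.add_span_processor(BatchSpanProcessor(KafkaRestProxySpanExporter("http://localhost/kafka-rest-proxy")))
trace.set_tracer_provider(provider)

app = FastAPI()
FastAPIInstrumentor.instrument_app(app)  # automatic spans for every FastAPI request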
Figure 7.8 portrays the OpenTelemetry Kafka topic and the extensive number of events registered to this topic. All the events dispatched to this topic are consumed by the OpenTelemetry processor service. This service collects and updates metrics, which are then scraped by Prometheus and eventually transferred to Grafana. This flow is communicated in Figure 7.9.
Figure 7.10 depicts an example of a trace span JSON that is communicated over Kafka to the OpenTelemetry service. Providing this example helps illustrate the specific data format and fields used to represent and propagate distributed traces across the various services, which is a key mechanism for collecting the telemetry data underlying the metrics analysed in this scenario.
P ⟹ Q (7.1)
Q ⟹ R (7.2)
P ⟹ R (7.3)
This logical sequence implies that every instance of data transfer from Domain
A to Domain B guarantees a subsequent confirmation of its successful storage. Such
a relationship underscores the system’s efficiency and reliability in managing large
volumes of data, resonating with the findings of K. Lee and Wang (2020) who empha-
sised the significance of ensuring data consistency and integrity in distributed storage
mechanisms.
As illustrated in Figure 7.11, the Kafka data ingestion rate peaked at 1,693,816,791
bytes at 2023-09-04 20:40:00 in customer domain’s analytical service, indicating the
flow of data from the operational service to the analytical service. This bar chart also
communicates the gradual increase in the load placed on the service. Furthermore, the
total data ingested over the experiment duration amounted to 1.6 billion bytes, as per
the Figure 7.12.
In the same vein, data ingestion rate for the weather domain’s analytical service is
portrayed in Figure 7.13 peaking at 1,693,893,416 bytes at 2023-09-05 18:00:00. This
service has ingested a total of 1.6 billion bytes.
Figure 7.11: Kafka Data Ingestion in Bytes in Customer Domain’s Analytical Service
Figure 7.12: Total Data Ingested in Bytes in Customer Domains’ Analytical Service
Figure 7.13: Total Data Ingested in Bytes in Weather Domain’s Analytical Service
In order to provide an empirical basis for the system’s performance during high
data ingestion, the metrics as shown in the screenshots are visualised in Grafana. This
monitoring system provides time-stamped data points that indicate the amount of data
ingested at various intervals. Table 7.9 captures the data points over a specific time span in the customer domain’s analytical service. Moreover, Table 7.10 presents the data in the weather domain’s analytical service.
After capturing the ingestion rate in bytes, it is essential to further explore other
metrics to provide a comprehensive analysis of the system’s performance during high
data load conditions.
Ingestion Latency: Ingestion latency captures the delay from the instant a data record
is published to Kafka to the moment it is processed by the system. Precisely, it is
determined for each consumed Kafka record using the formula:

ingestion_latency = consume_time − publish_time

Where:
• publish_time represents the timestamp when the record was published to Kafka.
If the Kafka record lacks a timestamp, the current time is assigned as a default,
making the latency calculation a fallback measure.
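A small sketch of this calculation is shown below, assuming the consumed Kafka record carries an optional publish timestamp in milliseconds; when the timestamp is absent the current time is used, so the computed latency for that record is effectively zero:

import time

def ingestion_latency_seconds(record_timestamp_ms=None) -> float:
    consume_time = time.time()
    # Fall back to the current time when the record lacks a publish timestamp.
    publish_time = record_timestamp_ms / 1000.0 if record_timestamp_ms else consume_time
    return consume_time - publish_time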
This latency metric is presented in Figure 7.14. According to this figure, the latency is negligible in both domains’ analytical services, with one spike to 400 seconds and an otherwise fairly stable ingestion latency of around 50 to 100 seconds. The weather domain shows even lower latency than the customer domain, at an average of 50 seconds. This is due to the fact that this domain consumes less data.
data systems leveraging Kafka, as reported in industry studies and benchmarks. For
instance, the Kafka documentation from Confluent suggests that latencies under 10
milliseconds are considered excellent, while latencies up to a few seconds are reasonable
for most use cases (Confluent, 2024).
Similarly, a study by Kreps (2023) on benchmarking Apache Kafka found median
latencies ranging from a few milliseconds to over a second, depending on the cluster
size and configuration. Sporadic latency spikes can be attributed to Kafka’s intricate
underlying mechanisms, such as indexing, offset management, and partition assignment
strategies, as well as potential influences from the operating system and environment.
Despite these inherent challenges, the predominantly low and stable latency measure-
ments demonstrate the architecture’s capability to maintain reasonable responsiveness
while handling high-velocity, high-volume data ingestion across multiple domains.
CPU Usage: The telemetry processing service, central to the infrastructure, provides
vital insights when monitored for CPU utilisation. Monitoring this service is impor-
tant as it offers direct insights into data processing efficiency, where elevated CPU
consumption could signal potential processing bottlenecks.
Additionally, since CPU utilisation metrics are determined over specific intervals,
and given the asynchronous receipt of telemetry data from various services, tracking the
CPU usage of individual services might not accurately represent the real-time system
load. However, the telemetry service, due to its consistent data processing role, ensures
that its metrics are indicative of the actual strains on the system.
This observation becomes crucial when evaluating scalability and predicting future
infrastructure requirements. Importantly, it must be noted that CPU usage metrics can
be influenced by the operating system and environment in which the services operate.
Thus, the reported values might not be absolute but serve as a credible relative measure
of system resource consumption.
Figure 7.15 illustrates the CPU utilisation trends of the telemetry processing service,
emphasising the interplay between data intake rates and resource consumption.
In the observed metrics from the Open Telemetry service, it is evident that the
CPU usage remains relatively consistent at around 24 percent, with intermittent peaks observed, pushing the utilisation to 25.8 percent at specific intervals. This behaviour can
be attributed to the underlying architecture, where events are continuously streamed
to the connected services. To further elucidate this point, a detailed tabulation of
the CPU utilisation over a specific period is presented in Table 7.11. This tabulation
provides insights into the fluctuation in CPU usage across different times and conditions,
revealing periods of consistent CPU usage, instances of inactivity, and occasional spikes
in utilisation.
The steady CPU utilisation averaging around 24 percent with occasional peaks up to
25.8 percent supports the assertion of a stable processing environment for the telemetry
service. As outlined in the CPU utilisation study by Gregg (2014), sustained high CPU
usage above 80-90 percent typically indicates resource saturation and potential perfor-
mance issues. In contrast, the moderate utilisation observed here suggests sufficient
resources to handle the incoming event load without significant computational strain.
Moreover, the Kubernetes cluster CPU usage (Figure 7.16) displays a consistent level of around 43%, with occasional spikes up to 250%. These spikes are usually observed when a lot of data is transferred through the network and into Minio.
Along the same lines, as portrayed in Figure 7.17, an average of 5.37GB of data reads and 12.2GB of data writes have taken place across different intervals. Additionally, 11.6GB of data has been received over the network, and 13.9MB of data has been sent (Figure 7.18).
From a holistic examination of the telemetry and infrastructure metrics presented, it
becomes evident that the asynchronicity and event-driven architecture of our BD system
play a pivotal role in its performance dynamics. Given this asynchronous event-driven
paradigm, it is of importance to recognise that the sheer volume, measured in gigabytes,
of data processed becomes a lesser concern compared to how efficiently the system
handles incoming data streams.
The consistently moderate CPU usage of the telemetry service, even amidst varying
data intake rates, underscores its robustness in data processing. Similarly, while the
Kubernetes cluster exhibits occasional spikes in CPU usage during data-intensive
operations, it largely maintains a balanced load. This is a testament to the system’s
adeptness in managing and distributing tasks across its nodes.
The data statistics on read, write, and network transfers further explain this point.
While the gigabytes of data read, written, or transferred might seem significant, what
stands out is not the absolute quantity but the system’s efficacy in handling these
operations. An average of 5.37GB read and 12.2GB written, juxtaposed with the
network statistics presented in Figure 7.18, indicates the system’s proficiency in data
management and network operations, without being overwhelmed.
The metrics not only highlight the stability and efficiency of our BD infrastructure
but also provide empirical evidence that, in an event-driven asynchronous system
like Metamycelium, it is not just about the volume of data processed but more about
the efficiency and stability with which it is handled. This insight suggests potential
implications for scalability considerations in similar architectures going forward.
In the weather domain, operational services display stable memory usage due to their less computationally intensive nature. In contrast, analytical services may show variable memory patterns depending on the complexity of data analysis.
The customer domain follows a similar trend, with operational services displaying
stable memory usage due to serving primarily static customer data. Meanwhile, the
analytical services, which dive deeper into customer behaviours and preferences, might
see periodic spikes, especially when handling sizable datasets.
All these observations can be visualised in Figure 7.19, which showcases the memory utilisation trends across the various services. The address of each service is stated in the corresponding service definitions in the GitHub repositories, but it is worth mentioning that the service on port 8000 is the customer domain's operational service, while the service on port 8001 is the customer domain's analytical service. The service on port 8005 is the weather domain's operational service, while the service on port 8006 is the weather domain's analytical service. Finally, the service on port 8008 is the telemetry processing application.
Notably, periods of elevated memory usage coincide with increases in the CPU utilisation of the telemetry processing service. This alignment elucidates the system's responsive scaling capabilities in periods of intensive computational demand. Similarly, instances of reduced
CPU usage are mirrored by a lower memory footprint, underlining efficient resource
management during less intensive processing periods. Such adaptive memory utilisation
is critical to maintaining system performance and ensuring the seamless management
of real-time data across the infrastructure.
Table 7.12: Correlated Memory and CPU Usage for Various Services.
Date & Time Svc A (B) Svc B (B) Svc C (B) Svc D (B) Svc E (B)
09/06 00:00 H (12,750,000,000) H (12,800,000,000) L (4,700,000,000) H (14,000,000,000) H (13,500,000,000)
09/06 12:00 L (6,350,000,000) L (6,400,000,000) L (4,650,000,000) M (10,850,000,000) M (10,650,000,000)
09/07 00:00 L (6,375,000,000) L (6,425,000,000) L (4,675,000,000) M (10,870,000,000) M (10,670,000,000)
09/07 12:00 L (6,360,000,000) L (6,410,000,000) L (4,680,000,000) M (10,860,000,000) M (10,660,000,000)
09/08 00:00 M (9,560,000,000) M (9,610,000,000) L (4,720,000,000) H (13,800,000,000) H (13,300,000,000)
09/08 12:00 M (9,580,000,000) M (9,630,000,000) L (4,750,000,000) H (13,850,000,000) H (13,350,000,000)
09/09 00:00 H (12,780,000,000) H (12,830,000,000) L (4,770,000,000) H (14,100,000,000) H (13,600,000,000)
09/09 12:00 H (12,760,000,000) H (12,810,000,000) L (4,760,000,000) H (14,050,000,000) H (13,550,000,000)
09/10 00:00 M (9,570,000,000) M (9,620,000,000) L (4,730,000,000) M (10,900,000,000) M (10,700,000,000)
09/10 12:00 M (9,590,000,000) M (9,640,000,000) L (4,740,000,000) M (10,920,000,000) M (10,680,000,000)
09/11 00:00 M (9,565,000,000) M (9,615,000,000) L (4,710,000,000) M (10,880,000,000) M (10,670,000,000)
09/11 12:00 H (12,770,000,000) H (12,820,000,000) L (4,775,000,000) H (14,110,000,000) H (13,610,000,000)
09/12 00:00 M (9,585,000,000) M (9,635,000,000) L (4,745,000,000) M (10,930,000,000) M (10,690,000,000)
09/12 12:00 M (9,600,000,000) M (9,650,000,000) L (4,755,000,000) M (10,950,000,000) M (10,700,000,000)
Legend: B - Bytes, H - High Usage, M - Medium Usage, L - Low Usage. Service A corresponds to localhost:8000, B to
localhost:8001, C to localhost:8005, D to localhost:8006, and E to localhost:8008.
Error Rate: Within the FastAPI services, error rates are determined by carefully
examining logs and by tracking exception blocks. Each domain has a dedicated Kafka
topic allocated solely for error reporting. Yet, as illustrated in Figure 7.20, only
the OpenTelemetry service reported errors. The Weather and Customer domains,
contrastingly, remained error-free.
The inherent resilience of event-driven systems significantly contributes to this low
error rate. These systems’ characteristic decoupling between components ensures that
transient failures in one service are often localised, preventing them from cascading into
system-wide disruptions. This behaviour is evident in the prototype, which underscores
the intrinsic benefits of event-driven architectures in maintaining system robustness.
With the above metrics considered in conjunction with the ingestion rate, a comprehensive picture of the system's behaviour under Scenario S1 emerges.
Scenario S2 tests the system’s capabilities under high data velocity conditions, ap-
proximating peak ingestion rates. This experiment involved Domain A (Customer
Domain) streaming data into Domain B (Weather Domain), with Domain B subse-
quently processing the streamed data. To facilitate this, a dedicated Kafka topic, named
customer-domain-stream-data, is established, earmarking it exclusively for the stream-
ing operations between the two domains.
Concurrently, within the customer domain, a novel endpoint is created. This end-
point is responsible for streaming out the data that had been previously stored. In
synchrony with this, the weather domain analytical service adopts a long-polling mech-
anism. By doing so, it could efficiently subscribe to and intake the continuous data
stream relayed from the customer domain.
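As a rough illustration of this arrangement, the sketch below assumes FastAPI on the producing (customer domain) side and httpx on the consuming (weather domain) side; the endpoint path, file name, and newline-delimited JSON framing are assumptions for the example rather than the prototype's exact implementation.

```python
import json

import httpx
from fastapi import FastAPI
from fastapi.responses import StreamingResponse

app = FastAPI()


def read_stored_records():
    # Placeholder for reading records previously persisted by the
    # operational service; one JSON document per line (NDJSON).
    with open("customer_records.ndjson", encoding="utf-8") as source:
        for line in source:
            yield line


@app.get("/stream/customer-domain-stream-data")
def stream_customer_data():
    # Newline-delimited JSON keeps every chunk independently parseable.
    return StreamingResponse(read_stored_records(),
                             media_type="application/x-ndjson")


def consume_customer_stream(base_url: str):
    # Weather domain side: keep the connection open and yield records
    # as they arrive, instead of blocking on one large response.
    with httpx.stream("GET", f"{base_url}/stream/customer-domain-stream-data",
                      timeout=None) as response:
        for line in response.iter_lines():
            if line:
                yield json.loads(line)
```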
An analysis is conducted based on the metrics of interest:
Volume and Size of Messages: The system ingested a total of 771,305 messages. Cumulatively, these messages totalled approximately 1 GB in volume, all of which were processed successfully.
Memory and CPU Utilisation: Memory footprint during this high-velocity data
scenario is important. Continuous monitoring of memory usage was instrumental in
gauging system health and efficiency. Memory consumption in the customer domain
peaked noticeably, registering 12,852,592,640 bytes. This is attributed to the inherent
complexity of the chunking logic employed in the streaming operation (Figure 7.22).
Contrarily, the weather domain, which acts as the data consumer, reported more modest
memory usage. This is inferred to be a consequence of its simpler data processing logic,
which is devoid of the overheads of chunking (Figure 7.23).
Moreover, CPU usage increased by only 5% compared to the figures reported in Section 7.3.3 for both the Kubernetes cluster and the corresponding services; therefore, no additional figure is provided for this metric.
Processing Duration and Ingestion Latency: The ingestion latency, which is the
delay from when a data packet is received to the point it is ingested for processing,
remained consistent over the monitored period. The latency consistently measures
0.0000148 seconds at different points in time during the monitoring period. This
observation indicates a lack of significant processing delays in the system’s data inges-
tion mechanism, although a comparative analysis with other platforms’ performance
benchmarks would provide a more comprehensive evaluation (Figure 7.24). This con-
sistent latency suggests the system’s capability to sustain its performance levels without
disruption, particularly during instances of high data traffic.
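One way an ingestion-latency figure of this kind can be recorded is sketched below, assuming an OpenTelemetry histogram; the metric name and the storage placeholder are hypothetical.

```python
import time

from opentelemetry.metrics import get_meter

meter = get_meter("weather-domain-analytical-service")

# Histogram of the delay between receiving a packet and ingesting it.
ingestion_latency = meter.create_histogram(
    "ingestion.latency",
    unit="s",
    description="Delay between receipt and ingestion of a data packet",
)


def store(record):
    # Placeholder for persisting the record (e.g. to SQLite or object storage).
    pass


def ingest(record, received_at: float):
    # received_at is captured with time.monotonic() when the packet arrives;
    # the latency is the gap between receipt and the start of ingestion.
    ingestion_latency.record(time.monotonic() - received_at)
    store(record)
```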
Moreover, processing duration gives us insights into the time taken by the system
to process the ingested data. The Weather Domain showcases a consistent processing
time of 1,694,491,558 nanoseconds (or approximately 1.6945 seconds) across different
timestamps (Figure 7.25). This uniformity indicates a stable data processing mechanism,
unaffected by potential variables during the streaming process.
The evaluation of the High Velocity Data Ingestion Scenario (S2) indicated that
the system consistently managed high-velocity data ingestion within the observed
parameters, reflecting its capacity to handle such scenarios effectively. Through the
creation of a dedicated Kafka topic and the strategic implementation of a new endpoint in
the customer domain to facilitate data streaming, peak data ingestion rates are effectively
simulated. The system’s memory usage patterns showcase its capability to manage
substantial data flows. Specifically, the customer domain, driven by its chunking logic,
showcases a higher memory usage compared to the Weather domain. Nevertheless,
both domains effectively handled the processing and ingestion of a substantial total of
771,305 messages (approximately 1GB in size).
Further, the consistent ingestion latency and processing duration underscore the
system’s robustness. In conclusion, the system demonstrates a reliable capacity to
ingest, store, and process a continuous stream of high-velocity data without significant
performance degradation, meeting the expected outcomes of Scenario S2. In addition, the system was continuously monitored to ensure that the memory allocated for streaming was freed upon completion of the transmission. This monitoring yielded no sign of memory leaks, and all memory was released once streaming completed.
Given the comprehensive results obtained from scenarios S1 and S2, which encom-
pass both high-volume and high-velocity data ingestion processes, it is deemed that
scenario S3’s focus on data variety is inherently addressed and validated, thus obviating
the need for a separate testing phase for scenario S3.
Next, scenarios S4, S5, and S6 are collectively assessed. The decision to test these
scenarios concurrently stems from their shared system processes and the capability to assess them against a common set of metrics:
1. Query Processing Time (QPT): Measures the duration from when a complex
query is initiated to when its results are returned. It encapsulates the computational overhead of query execution.
2. Secret Retrieval Latency (SRL): Captures the time taken to retrieve secrets. It
summarises the system’s efficiency in secret management.
3. Data Integrity Verification (DIV): Assesses the integrity of data during trans-
mission and processing, ensuring data security.
\[
\mathrm{DIV} \;=\; \Big( \operatorname{Hash}(\mathrm{Data}_{\mathrm{Sent}}) \stackrel{?}{=} \operatorname{Hash}(\mathrm{Data}_{\mathrm{Received}}) \Big) \tag{7.7}
\]
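A minimal sketch of this check in Python, assuming SHA-256 as the hash function (the specific algorithm is an assumption; any collision-resistant digest serves the same purpose):

```python
import hashlib


def data_integrity_verified(data_sent: bytes, data_received: bytes) -> bool:
    # DIV holds when the digest of the transmitted payload matches the digest
    # recomputed over the payload on the receiving side.
    sent_digest = hashlib.sha256(data_sent).hexdigest()
    received_digest = hashlib.sha256(data_received).hexdigest()
    return sent_digest == received_digest
```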
After retrieving the domains' data through an authenticated approach using OpenID Connect and Vault on multiple occasions and across different days, the subsequent findings emerge:
Secret Retrieval Duration: The observed metrics showed, for the majority of retrievals, a duration of 3 seconds across timestamps. This indicates a reliable and predictable secret management system, as portrayed in Figure 7.27.
(2017) emphasises the importance of data consistency and network resilience, crucial in
modern distributed frameworks. The prototype’s ability to handle concurrent updates
and maintain data integrity highlights the efficacy of this architectural approach in
distributed systems development.
Infrastructure Challenges
1. The Confluent Helm charts could not be used as they utilised an alpha version of a Kubernetes resource named PodResourceBudget; this resource was available only in alpha up to v1.22, while this experiment operated on version 1.27. NGINX Ingress also tried using the cloud manager to create a load balancer, leaving the external IP of the ingress service perpetually pending. This occurred because the cluster was launched in a local development environment, so no managed cloud service such as EKS or AKS was available to provision a load balancer.
2. Another infrastructure challenge arose with the Kafka pod, which repeatedly
failed. Given Kafka’s heavy reliance on file storage for various types, like
segment files and log files, the originally allocated 64GB for the Kafka instance
proved insufficient. The configuration was subsequently adjusted to enhance the storage allocated to the Kafka instance.
3. During development, Prometheus and Grafana required a host address for inter-
communication. This was also true for the telemetry processing service and the
Prometheus metrics scraper. However, since the prototype was developed within
a single computation environment, much of the intercommunication depended on
the host. This posed complexities, as the loopback address or localhost in each
Docker container resolved to the container's loopback rather than the host. To counter this issue, a distinct DNS name referring to the host was employed.
Integration Challenges
5. After setting up the ingress, the task was to direct traffic accurately to various
services. Most services had their own dedicated servers, necessitating the correct
configuration of the reverse proxy to ensure upstream servers could respond
appropriately. For instance, the Keycloak server had issues recognising specific
host paths, anticipating a root path for serving files. Both the server and proxy
needed adjustments to function correctly. Similar issues arose for most services, since much of the open-source software in use required configuration to serve assets on paths other than the root.
7. Although Kiali was deployed as a Helm chart, its underlying use of Prometheus
for metrics and logs retrieval made the network connection intricate. Challenges
arose due to networking complexities and Kubernetes’ inherent service accounts.
9. A principal challenge during the design of the data quantum revolved around
determining the data resolution nature from the analytical service. With the usage
of FastAPI, which is intrinsically event-driven, it is unwise to block the main
thread while continuously requesting an HTTP endpoint for events. A solution
involved a conditionally invoked function to fetch all events from a specific topic,
either upon startup or via an endpoint.
10. Another ingress-related challenge arose from the local-host-based cluster devel-
opment. To direct ingress traffic to the Kafka rest proxy, nginx was configured
for target rewriting. While this approach suits ingress network requests, egress
requests pose difficulties. For example, when the Kafka rest proxy API returned
the consumer URL, it omitted the rewrite rules, necessitating custom code to
modify the URL.
11. Managing the data flow for domain B (customer reviews domain) posed chal-
lenges due to the considerable size of Yelp academic datasets, some extending to
5GB. To facilitate version control and prevent server and stream timeouts, these JSON files required chunking. A custom Python script was devised to handle JSON file chunking, ensuring no file exceeded 99MB (a simplified sketch of this approach follows this list).
12. SQLite, utilised in analytical services to store various data forms, posed challenges
with semi-structured data like JSON. The inconsistency in JSON data formats
in the customer domain necessitated the creation of distinct tables for diverse
semi-structured data sets. To tackle this, a custom function was developed to dynamically generate SQLite tables (also sketched after this list).
13. An observed inefficiency emerged when initially collecting metrics from various
services using standard Python and Node APIs. It was soon discerned that
numerous traces lacked essential metadata for proper visualisation, like service
name, service version, and environment. This oversight necessitated a rerun of
scenario one for optimal visualisation and narrative coherence.
14. A related issue involved the automatic deletion of stale Kafka consumers by
the Kafka-rest-proxy service. This action resulted in service failures during
subsequent Kafka requests. To mitigate this, a custom function was developed to
rewrite the consumer and reestablish a connection to the REST server.
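A simplified sketch of the chunking approach from challenge 11, assuming newline-delimited JSON input and a 99MB ceiling per output file; file naming and paths are illustrative rather than the prototype's exact script.

```python
import os

MAX_CHUNK_BYTES = 99 * 1024 * 1024  # keep every chunk below 99MB


def chunk_ndjson(source_path: str, output_dir: str) -> None:
    os.makedirs(output_dir, exist_ok=True)
    chunk_index, current_size, out = 0, 0, None
    with open(source_path, encoding="utf-8") as source:
        for line in source:
            encoded = line.encode("utf-8")
            # Start a new chunk when the next record would exceed the limit.
            if out is None or current_size + len(encoded) > MAX_CHUNK_BYTES:
                if out is not None:
                    out.close()
                chunk_index += 1
                out = open(
                    os.path.join(output_dir, f"chunk_{chunk_index:04d}.json"),
                    "wb",
                )
                current_size = 0
            out.write(encoded)
            current_size += len(encoded)
    if out is not None:
        out.close()
```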
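Similarly, the dynamic table generation from challenge 12 might look roughly as follows; the function name and the decision to store nested values as JSON text are assumptions rather than the prototype's exact logic, and column names are assumed to be safe SQL identifiers.

```python
import json
import sqlite3


def store_semi_structured(db_path: str, table: str, record: dict) -> None:
    # Derive the column set from the record's keys; nested values are kept
    # as JSON text so heterogeneous records can share a loosely typed table.
    columns = list(record.keys())
    values = [
        json.dumps(value) if isinstance(value, (dict, list)) else value
        for value in record.values()
    ]
    with sqlite3.connect(db_path) as connection:
        connection.execute(
            f"CREATE TABLE IF NOT EXISTS {table} "
            f"({', '.join(f'{column} TEXT' for column in columns)})"
        )
        placeholders = ", ".join("?" for _ in columns)
        connection.execute(
            f"INSERT INTO {table} ({', '.join(columns)}) "
            f"VALUES ({placeholders})",
            values,
        )
```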
Performance Challenges
15. During the process of transferring operational data from the operational service to
the analytical service, storage information was dispatched alongside the success
event. This method introduced causality issues. Subsequent requests, aimed at
processing this data and registering it to Datalichen, depended on the distributed event having already been propagated, which made the ordering of operations difficult to guarantee.
16. Psutil’s inability to accurately depict CPU usage due to Apple’s M2 chip architec-
ture posed an additional performance challenge.
7.5 Conclusion
The rigorous testing of the Metamycelium system under various scenarios has con-
clusively shown its capability to meet and, in several instances, exceed the stipulated
requirements. Based on the experimental findings, the following conclusions can be
drawn:
1. Volume: The high volume data ingestion scenario (Scenario S1) demonstrates
the system’s capability in supporting both asynchronous and batch processing,
meeting requirements Vol-1 and Vol-2. The system provides scalable storage for
large datasets, effectively handling increased data ingestion during peak times.
4. Value: The Complex Query Scenario (Scenario S4) provides evidence of the
system’s computational prowess, affirming the requirements from Val-1 to Val-
4. The system’s dexterity in supporting both batch and streaming analytical
processing and its flexibility in handling multiple output formats set it apart.
This is highlighted in the experiment, as each domain holds a different type of
semi-structured data.
5. Security & Privacy: Scenarios S5 and S6, centred around secret management
and data security respectively, robustly validate the system’s dedication to meet-
ing SaP-1 and SaP-2. The strategic use of OpenID Connect and Hashicorp
Vault ensures best practice protection, retention, and multi-level authentication
mechanisms.
In the next chapter, Metamycelium and the design theories discussed will be subject
to expert opinions to point out strengths and limitations.
Chapter 8
Evaluation - Expert Opinion
8.1 Introduction
In this section, prevalent themes from expert responses are methodically identified
and analysed through a systematic coding and clustering process. This scrutiny not
only highlights recurring ideas but also contextualises the breadth of perspectives that
influence the Metamycelium RA.
Transcripts from expert opinions were thoroughly reviewed to identify recurring
ideas, perspectives, and insights. Key themes were then analysed in-depth, examining
nuances, similarities, and differences within each theme, along with the underlying
rationale and implications. These prevalent themes, organised and presented in a
structured manner, enable a comprehensive exploration of the key areas of focus,
concerns, and insights offered by the expert panel.
This thematic analysis not only highlights the recurring ideas but also situates them
within the broader context, providing an understanding of the various perspectives on
the Metamycelium RA.
"In certain areas, the sense is that the artefact is slightly ahead of times with only a few
diving into micro services." - i2
"Industry trends might not align perfectly, but this is definitely industry relevant." - i1
"The architecture’s complexity might challenge some organisations, even larger ones,
during adoption." - i3
The Metamycelium architecture attracted commendation for its distinctive features and
perceived strengths, particularly around the integration of domain-driven design and
data engineering.
i1 underscored the appeal of a domain-driven approach combined with data en-
gineering. Noting that traditionally, software engineering and data engineering have
disparate goals and KPIs, i1 emphasised the significance of bridging this divide.
"Most of the time you were talking, you were putting software engineering and data
engineering together. It sounds really good in theory... If you manage to solve that, that
could be a good strength to have... teams working together, you don’t have a person in
the middle so you can move much faster." - i1
"I like that. That will be quite distinctive, because I’m yet to see that in action." - i1
The architecture’s flexible nature and comprehensive ecosystem also gained positive
attention from another expert. i3 particularly praised its capability to encompass
everything from batch eventing to ACLs, and the significance of having immutable logs
(refers to immutability discussed in Section 6.4.13) to establish a single source of truth.
"It considers everything from batch eventing, ACLs to infrastructure. Immutable log is
very important these days because it helps identify the single source of truth." - i3
"As a platform as a service, the fact that you can self provision components as you need
to, is very important." - i3
"I love this modular approach... I love the fact that it also calls out domain driven
design and a clear separation of data governance." - i2
i1 also delved into its combination of different data views like data mesh and event-
driven aspects, mentioning the variance from other approaches but appreciating its
unique stance.
"I think bits and pieces come from a different view. Like you have a data mesh here,
event-driven there... The approach of open policy agent looks promising." - i1
In wrapping up, the overall sentiment suggests that while certain components of
Metamycelium might find parallels in the industry, its holistic approach and integration
of domain-driven design with data engineering make it a unique offering. The challenge,
however, remains in ensuring this distinctiveness translates into practical advantages in
the evolving landscape of data operations.
Several experts provided insights into potential challenges and barriers for the Metamycelium architecture. The concerns encompass a range of issues, from skillset availability and the complexities inherent in data governance to scalability.
i1 identified a key concern regarding the skillset required to leverage the architecture,
especially at the local level.
"Skill set, that would be the first one from the local environment. Because again, there
would be very few data engineers who would be willing to go there. Decentralising
is great, but it is usually very hard to do well, like all those governance platform
libraries..." - i1
The expert further touched upon the difficulties in governance at large-scale enter-
prises and potential political influences.
"But I’d be talking big and why big enterprises. There’s always politics in it. And lots
of advice." - i1
"From a data engineering point of view... your governance is more concretely defined.
In Software, it is a bit more flexible; these two have different KPIs and that might be a
bit of friction." - i1
One specific challenge that i1 raised was related to the complexities that may arise
due to evolving ownership patterns and dependencies.
"So one of the challenges I can see happening at some point as it evolves, is the
ownership might become a bit more complicated... Now I have dependencies on A and
B, I’m not aware of." - i1
"Some companies have very large data sets and so moving to a domain-driven is kind
of wall sometimes if they have slightly monolithic systems... This data is just so big, so
hard to play with." - i2
He also pointed out challenges related to the clear ownership of data domains.
"Yeah, where they already have this centralised kind of data warehouse and they don’t
know who owns what... and which domain should own, it because it is not clear." - i2
A potential problem regarding the handling of unstructured data was also flagged.
"I’m not sure how it works with things like unstructured data... But when you really get
into really, really unstructured stuff, like emails or something like that, data governance
just becomes crazy." - i2
i3 highlighted the difficulty that may arise due to the cross-functional nature of the
architecture.
"This typically won’t be owned by one team, because this is an ecosystem. You’ll have
database administrators, developers, infrastructure people, product people, service
catalog people... I think the cross functionality of it all is what’s difficult." - i3
"I think for small organisations, this is going to be quite possibly very difficult for them
to implement. What’s gonna happen is they’ll just choose some cheaper version of this
when you have not enough people to deliver this sort of stuff." - i3
The expert opinion gathering sessions also surfaced additional perspectives and considerations that did not fit neatly into the primary thematic categories of applicability, strengths, and challenges. These supplementary viewpoints are discussed in this section.
i1 mentioned the potential allure of the architecture to certain demographics:
"I can see like a startup kind of person who would want to deploy it, because it is very
intriguing, because it is challenging just in the right way." - i1
i2 delved into the broader transformation in the industry toward viewing data as
a first-class citizen, emphasising the rise of Customer Data Platforms (CDPs) and
data-driven concepts:
"I think there was a huge move much towards data as a first class citizen... Data is
a first class citizen. You’ve got to think about data first and the value that the data
provide. And so I think that as an industry or multi industry, that transformation is
really happening already." - i2
i2 also pointed out a gap in data adoption and maturity, even among prominent IT
companies:
"Yeah, data... directly like data adoption, data use. I mean, look at companies ourselves,
we are being one of the most successful IT companies in this country and data maturity
is not there. I haven’t worked yet in a company that has matured data-driven culture,
including some very big names that I’ve worked for in the past." - i2
This section sheds light on a broader range of opinions and considerations about the
Metamycelium architecture, capturing nuances and insights that may not fit neatly into
other categories.
Table 8.1 summarises the responses of each expert and provides a concise representation
of their views on the applicability, strengths, and challenges of the Metamycelium
architecture.
Table 8.1: Mapping of Expert Responses Against Themes
The ‘Applicability’ column captures the experts’ assessments of how well the
Metamycelium architecture aligns with current industry practices and future trends,
indicating whether they see the architecture as having positive, mixed, or negative
applicability in real-world business settings. The ‘Strengths’ column highlights the
distinctive features and perceived advantages of the Metamycelium architecture, as
identified by the experts, summarising their perspectives on the architecture's innovative aspects.
From the coding data, it is evident that opinions and discussions varied across different
aspects of the Metamycelium architecture. The quantitative summary of these codings
is displayed in Table 8.3.
The themes "Applicability and Industrial Relevance", "Challenges and Potential
Barriers", and "Strength and Distinctive Features" have the highest number of coding
references (9 references each), signifying their prominence in the expert discussions.
Under "Applicability and Industrial Relevance", discussions on the "Alignment with
Industry Trends and Future Projection" were predominant with 6 references. The
theme "Challenges and Potential Barriers" saw equal prominence between "Adoption
Challenges for Organisations" and scenarios where the architecture may not be the best
fit. Moreover, "Other Opinions", which provided auxiliary views and considerations,
had 3 references, making it a lesser-discussed theme. These data are visualised in Figure 8.1.
In addition, following each expert opinion gathering session, experts were invited to complete a survey covering each category. The survey, as outlined in Section A.7, is prepared and intended for distribution to the three experts, and structures its questions on a Likert scale ranging from 'Strongly Disagree' to 'Strongly Agree'. The responses from the experts,
showcasing diversity in their perspectives, are mapped to numerical values for concise
representation in the table. The mapping is as follows:
• Strongly Agree: 2
• Agree: 1
• Disagree: -1
• Strongly Disagree: -2
Expert Q1 Q2 Q3 Q4 Q5 Q6 Q7 Q8 Q9 Q10
I1 1 1 0 1 2 1 0 2 1 1
I2 2 2 2 2 2 1 -1 1 2 2
I3 -1 1 0 -1 -1 2 2 2 1 1
The overarching sentiment from experts indicates an appreciation for the efficiency,
scalability, and effectiveness of the proposed architectural approach. Across the board, from case experiments to expert feedback, there is a consistent affirmation of the architecture's potential.
Reflecting on these findings in the context of the research questions, it is evident
that the expert opinions provide valuable insights into the viability, applicability, and
potential impact of the Metamycelium architecture. The feedback from experts across
different domains and industries suggests that the architecture has the potential to
address the challenges of BD systems and offer a more efficient and effective approach
to data management and analytics.
Moreover, the expert feedback highlights the growing recognition of the need for
more advanced, domain-driven, and data-centric architectures in the industry. This trend
aligns with the broader shifts in the field towards more responsible and ethical data
practices, as evidenced by the experts’ emphasis on the importance of data governance,
privacy, and ownership.
Situating the Metamycelium architecture within the existing literature reveals its
contribution to the ongoing discourse on the design and development of BD systems.
The architecture’s unique combination of domain-driven design, event-driven communi-
cation, and modular components resonates with emerging trends and best practices in
the field (Laigner et al., 2021a; Dehghani, 2022). This positions Metamycelium as a
novel and promising approach to addressing the challenges of BD systems.
However, the expert feedback also underscores the importance of considering the
sociotechnical dimensions of implementing such an advanced architectural approach
in practice. The concerns raised regarding the complexity, skillset requirements, and
organisational readiness align with the findings of previous research on the adoption and
implementation of BD systems (H.-M. Chen, Kazman & Haziyev, 2016b; Mikalef et al.,
2018). These challenges highlight the need for further research and development efforts
to support the successful implementation of Metamycelium in real-world contexts.
Experts have endorsed the architecture’s ability to align well with primary assumptions,
handle varied scales of data, and deliver effective results in practical applications. The
real test of any theoretical construct lies in its tangible impact. The successful execu-
tion of tasks using this architecture, and its efficacy in handling complex operations,
demonstrates that the design is not just theoretically sound but also practically capable.
The experts’ feedback highlights several key strengths of the architectural approach.
The modular design of the architecture was praised for its flexibility and adaptability
to various use cases and domain-specific requirements. This modularity allows for the
seamless integration of components and enables the architecture to be customised to
meet the unique needs of different industries and applications while maintaining overall
consistency and interoperability.
Another aspect that received positive feedback was the architecture’s focus on
scalability and performance. The distributed nature of the architecture, along with
its capability to handle high volumes and velocities of data, was recognised as a
significant advantage. Experts acknowledged the criticality of these features in tackling
the constantly growing demands of BD systems and ensuring their long-term viability
and efficiency.
The incorporation of event-driven communication and real-time processing capa-
bilities in the architecture was also well-received by the experts. They recognised
the increasing importance of systems that can respond to data in real-time and make
decisions based on streaming data. The Metamycelium architecture’s support for these
capabilities was seen as a crucial factor in its potential to foster innovation and enable
new applications in the BD ecosystem.
However, it is also vital to note areas of concern or potential limitations. Some
experts cautioned that the comprehensive scope of the architecture might introduce
complexity in terms of implementation and management. The successful adoption of
Metamycelium may necessitate substantial investments in resources, skills development,
and organisational change management. Addressing these challenges will be crucial to
ensure the smooth implementation and widespread adoption of the architecture.
Moreover, experts stressed the importance of continuous evolution and improvement
of the architecture based on real-world feedback and performance metrics. Given
the rapid pace of change in the BD landscape, it is essential for the Metamycelium
architecture to remain flexible and adaptable to emerging technologies, standards, and
best practices. Regular updates and refinements based on user feedback and real-world
performance data will be vital to maintain the architecture’s relevance and effectiveness
over time.
Given the nascent stage of the architecture and its promising results, there is potential for
evolution. If more organisations adopt and implement the Metamycelium architecture,
valuable insights and lessons learned will emerge. Leveraging this feedback and real-
world performance analytics will be crucial to identify areas for improvement, optimise
the architecture, and ensure its continued relevance in the face of evolving BD challenges
and opportunities.
Challenges like intricate data dependencies, ownership issues, and the fluid nature
of data ecosystems are highlighted. Aspects of Metamycelium like immutability and
bi-temporality are discussed to address data evolution and dependencies. The conver-
sations brought forth the importance of understanding the role of domains, clarifying
architectural intent, and ensuring feedback loops for data quality. The balance be-
tween analytical and operational intents and their influence on interfacing with business
processes are also explored.
The experts emphasised the need for clear data lineage and provenance tracking
to understand how data evolves and flows through the system. Techniques like data
versioning and change data capture were suggested to manage data updates and maintain
a historical record of data changes.
The role of domains in the architecture was another focal point of the discussions.
Experts highlighted the benefits of aligning architectural components with business domains.
As products or domains evolve, they can lead to intricate dependencies. Techniques like
streaming between domains can ensure up-to-date data flow. The conversations with
the experts emphasised data privacy, the potential of synthetic data, and the importance
of a governance layer in the architecture. Customised data governance approaches
and evolving governance requirements, especially with new types of data, necessitate
constant vigilance.
Moreover, there were various discussions regarding the application of governance policies to unstructured data. This is particularly pronounced when Rego (the policy language) and OPA are discussed. Some of the challenges discussed pivot on the fact that OPA mostly supports policies concerning the security and privacy of API endpoints, service networks, and infrastructure, while data loads may introduce unstructured data types such as images.
Along the same lines, the experts delved into the complexities of data ownership
and the implications of evolving domain dependencies. As the system grows and new
data sources are integrated, the ownership and accountability of data may become
ambiguous. Experts suggested establishing clear data ownership policies and defining
roles and responsibilities for data stewardship within each domain.
Data privacy emerged as a critical concern during the discussions. Experts em-
phasised the need for robust data protection mechanisms, such as data encryption,
access controls, and data anonymisation techniques. The potential of using synthetic
data, which mimics the statistical properties of real data without exposing sensitive
information, was also explored as a means to balance data utility and privacy.
The importance of a dedicated governance layer within the Metamycelium ar-
chitecture was strongly emphasised. Experts highlighted the need for flexible and adaptable governance mechanisms within this layer.
It is paramount to ensure any new architectural approach aligns with industry practices
and requirements. Metamycelium’s architectural approach, while promising, needs to
continually refine based on insights from experts, ensuring its robustness, adaptability,
and relevance in the ever-evolving landscape of computational design.
The experts emphasised the importance of regularly assessing the alignment of the
Metamycelium architecture with industry trends and best practices. As the field of BD
and computational design continues to evolve rapidly, it is crucial to stay informed
about emerging technologies, paradigms, and patterns. Regularly engaging with in-
dustry experts, participating in conferences and workshops, and monitoring relevant
publications can help ensure that the architecture remains up-to-date and relevant.
The insights gathered from experts provide valuable feedback and perspectives that
can guide the refinement and evolution of the Metamycelium architecture. It is essential
to carefully analyse and incorporate these insights into the architectural design and
implementation roadmap. This may involve adjusting existing components, introducing
new features, or refactoring certain aspects of the architecture to better align with
industry requirements and expectations. Experts also highlighted the importance of
continuous testing and validation of the architecture in real-world scenarios. As the
architecture is applied to different use cases and domains, it is crucial to gather empirical
evidence of its effectiveness, scalability, and performance. Conducting case studies,
pilot projects, and performance benchmarking can help validate the architecture’s
capabilities and identify areas for improvement.
Another key aspect emphasised by the experts is the need for the architecture to be
adaptable and extensible. As new technologies and frameworks emerge, the architecture
should be able to accommodate and integrate them seamlessly. This requires a modular
and loosely coupled design that allows for easy extension and customisation. By
providing well-defined interfaces and abstraction layers, the architecture can enable the
incorporation of new components and technologies without significant disruption to
existing systems.
The experts also stressed the importance of fostering a strong community and
ecosystem around the Metamycelium architecture. Engaging with developers, data
engineers, and researchers who are actively working with the architecture can provide
valuable insights, contributions, and support. Encouraging open-source collaboration,
sharing knowledge through documentation and tutorials, and establishing forums for
discussion and feedback can help build a vibrant community that drives the continuous
improvement and adoption of the architecture.
During the discussions, specific technological solutions were highlighted. The utilisation of technologies like OPA and the idea of connecting IAM systems to Kafka were discussed.
These insights are particularly valuable, given that two of the experts hailed from
some of the world’s leading data and eventing solutions companies. Their perspectives
shed light on possible integrations and optimisations that could further enhance the
architecture’s robustness and applicability.
In one session, an expert in event-driven systems mentioned that his team had been receiving many requests for native integration of IAM systems into their Kafka solution. He was impressed to see that Metamycelium is also working towards the same functionality. Nevertheless, the expert acknowledged the current challenges with asynchronous authentication and authorisation systems.
In another session, an expert with extensive consulting experience in batch and stream processing BD systems shared the insight that the industry is currently trying to adopt the idea of 'stream only', even though the term streaming is not well understood and 'micro-batch' processing is sometimes perceived as streaming.
An expert from a leading BD solutions provider highlighted the industry’s shift to-
wards integrating stream and batch processing within a unified architectural framework,
akin to the Kappa architecture. This approach, prevalent among major BD solutions, of-
fers a unified data processing paradigm, enhancing efficiency and ensuring consistency
across various data workloads. It facilitates resource optimisation by accommodating
both stream and batch data within the same infrastructure, and supports advanced ana-
lytics and machine learning by enabling seamless integration of historical and real-time
data analysis.
Contrastingly, at Metamycelium, a different strategy is employed that focuses on
segregating computation workloads. However, the industry trend favours a singular
construct for both stream and batch processing, underscoring the move towards more
integrated, scalable, and flexible data processing architectures in response to the evolving
complexities of BD.
Building on the previous discussion, a further point of interest emerged during the
dialogue between the expert and the researcher. They delved into the concept of a
phased implementation strategy employed by companies in the realm of BD solutions.
Initially, many organisations opt for a singular architectural construct that seamlessly
integrates both batch and stream processing. This unified approach, as the expert
elucidated, serves as an effective starting point for companies due to its efficiency and
the streamlined management of diverse data types.
As these companies mature and their computational needs become more complex,
they often transition towards a bifurcated architecture. This evolution involves separat-
ing the batch and stream processing into distinct constructs. Such a transition is driven
by the need for more specialised, high-performance processing capabilities that cater to
the distinct characteristics and demands of batch versus real-time data streams.
The phased approach, therefore, offers a scalable and adaptable framework, allowing
organisations to evolve their data processing architectures in alignment with their growth
and the increasingly intricate nature of their data processing requirements. This strategy
underscores the dynamic nature of BD solutions architecture, adapting to the changing needs of organisations as they mature.
8.12 Conclusion
Chapter 9
Discussion
9.1 Introduction
broader applicability and relevance across various contexts. Finally, the "Research
Process and Insights" are explored in Section 9.6, highlighting the technical complexities
encountered, the importance of effective communication within academic and industry
settings, and the challenges associated with developing a technically advanced artefact.
The evaluation phase, comprising a case mechanism experiment and expert consulta-
tions, provided insights into the applicability, strengths, and limitations of Metamycelium.
While the case mechanism experiment (Chapter 7) demonstrated the architecture’s capa-
bility to handle high data volumes, velocities, and varieties, it also highlighted challenges
related to data dependencies, system complexity, and the need for specialised expertise.
These findings suggest the need to evaluate the generalisability of the experimental
results to real-world BD environments and consider the impact of resource constraints
and organisational dynamics on the successful adoption of Metamycelium.
While the artefact has managed to provide good results from the case mechanism
experiments and expert opinions, it can benefit from a longitudinal empirical study that
tests the architecture in various real-world organisations.
Future work should look more into the socio-technical aspects of adopting an architec-
ture like Metamycelium.
Moreover, given that the majority of current BD architectures are built upon centralised designs, as discussed in Chapter 4, and considering the ecosystem
of technologies and tools that are built to support these architectures (discussed in Chap-
ter 6), one has to think about migration challenges. While the idea of a decentralised
and domain-driven BD architecture may sound interesting and useful to companies, the
cost of creating a new culture, shifting to new technologies, and abandoning widely
accepted organisational practices may outweigh the benefits.
This can be analogous to companies that are moving away from monolithic software
architectures into microservices architecture and the myriad of patterns that have been
created to facilitate this process, as discussed in Chapter 4. Given this analogy, how
many patterns exist to help companies move away from centralised BD architecture
into emergent architectures like Metamycelium, Data Mesh or Data Fabric?
Given that currently most big data architectures are centralised, there is a need to
create integration patterns to help companies transition to emergent distributed data
architectures.
This raises another important point. Organisations that are not ready to absorb the
complexity of distributed systems may not fully leverage the benefits of architectures
like Metamycelium. Therefore, there is a need for a certain level of maturity; that is, maturity in technology, management, and culture.
Organisations that are not mature enough to absorb the complexity of decentralised big data architectures may not harness their benefits.
Another area of interest can be the use of Metamycelium for new AI applications
using LLMs. Metamycelium’s distributed nature and ability to scale horizontally can
be advantageous in meeting the demands required by LLMs. LLMs heavily rely on vast
amounts of training data, which need to be processed efficiently to ensure optimal model
performance (Kasneci et al., 2023). Metamycelium’s decentralised architecture can
facilitate the distributed processing of training data, enabling faster and more scalable
data preprocessing pipelines.
Big data architectures like Metamycelium can potentially enable the development and training of large language models in an efficient way.
Reflecting on the research process, several limitations and areas for future explo-
ration emerged. The prototyping and evaluation phases were constrained by limited
resources and the challenges of replicating a real-world BD system environment. Ad-
ditionally, while the research leveraged established methodologies, such as DSR and
empirically grounded RA development (Chapter 2), the integration of other evaluation
methods of evidence-based software engineering (Kitchenham et al., 2015) could further
strengthen the assessment of Metamycelium’s performance and scalability characteris-
tics. Future research should examine the methodological choices made in this study and
explore opportunities for enhancing the rigour and generalisability of the findings.
The development of guidelines and best practices for adopting and maintaining decentralised,
domain-driven BD architectures, aligned with the principles of Metamycelium, could
facilitate wider adoption and address the skill and expertise challenges identified in this
research (Dehghani, 2022; Hechler et al., 2023).
As the field of big data continues to evolve alongside advancements in AI, ongoing
research, experimentation, and collaboration among academia and industry will be vital
in shaping the future of decentralised BD architectures.
The case mechanism experiment and expert opinions offer complementary perspectives
on the Metamycelium architecture, providing a nuanced understanding of its strengths
and potential areas for improvement.
The case mechanism experiment provides quantifiable results, demonstrating the
architecture’s ability to handle diverse scenarios related to data volume, velocity, variety,
security, and privacy. For example, scenario 2, focusing on high-velocity data ingestion,
showcases Metamycelium’s capability to process a continuous stream of data without
significant performance degradation. This aligns with Expert i2’s positive feedback on
the architecture’s scalability and data handling efficiency, suggesting a convergence
between empirical results and expert opinion.
However, the case mechanism experiment also reveals potential challenges and
limitations. The complexity of data dependencies and ownership issues, as well as the
resource-intensive nature of certain scenarios, highlight areas where Metamycelium may
require further optimisation and refinement. These findings resonate with the concerns
raised by experts regarding the adoption challenges for organisations, particularly in
terms of skillset requirements and the intricacies of implementing a decentralised
architecture.
Expert opinions provide valuable qualitative insights, shedding light on Metamycelium’s
perceived strengths and distinctive features. The architecture’s domain-driven approach,
emphasis on data quality, and integration of event-driven communication are iden-
tified as key strengths. These features are seen as potential differentiators, setting
Metamycelium apart from existing solutions in the BD landscape.
Nevertheless, experts also highlight potential barriers to adoption, such as the com-
plexity of the architecture, the need for specialised skills, and the challenges associated
with organisational change management. For example, proprietary components such
as Data Lichen do not have an open source equivalent in the industry, which makes it
harder for companies to adopt Metamycelium. Furthermore, Metamycelium does not only introduce an architectural change; it can also introduce an operational change for companies, which can be perceived as costly.
These concerns underscore the importance of considering the socio-technical di-
mensions of implementing a decentralised architecture like Metamycelium, beyond its
technical merits.
To situate the findings from the case mechanism experiment and expert opinions within
the broader context of BD architectures, it is essential to consider the current state of
practice and academia.
The limitations identified in the case mechanism experiment, such as the complexity
of data dependencies and ownership issues, align with the challenges faced by organi-
sations in managing and processing large-scale, heterogeneous data using traditional
monolithic data pipelines (Gorton & Klein, 2015; Nadal, Herrero, Romero, Abelló et
al., 2017). The emergence of decentralised architectures, such as Data Mesh (Dehghani,
2022) and Data Fabric (Hechler et al., 2023), reflects a growing recognition of the need
for more flexible, scalable, and domain-driven approaches to BD management.
Metamycelium’s domain-driven, decentralised approach resonates with these emerg-
ing paradigms, offering a potential solution to the limitations of centralised architectures.
The positive feedback from experts regarding Metamycelium’s alignment with industry
trends and its potential to address current challenges in BD management supports this
notion.
However, the concerns raised by experts about the complexity and adoption chal-
lenges associated with Metamycelium also find parallels in the literature. Much
like the challenges experienced during the industry’s transition from monolithic soft-
ware architectures to microservices, implementing decentralised BD architectures like
Metamycelium requires significant organisational and cultural changes, as well as in-
vestments in skills development and technology infrastructure (Hechler et al., 2023).
These challenges are not unique to Metamycelium but reflect the broader difficulties
faced by organisations in transitioning from traditional data management practices to
more agile, domain-driven approaches.
Further, the SLR on BD RAs (Chapter 4) underscores the need for more research
on domain-driven, decentralised approaches, as well as the importance of address-
ing cross-cutting concerns such as data quality, metadata management, and security.
Metamycelium’s emphasis on these aspects aligns with the identified research gaps
and priorities in the field, indicating its potential to bridge the gap between academic
research and industry practice.
However, some of the limitations of the study, such as the controlled nature of the
case mechanism experiment and the potential biases in expert opinions, underscore
the need for further empirical validation and investigation. Future research should
aim to address these limitations by conducting more diverse and robust evaluations of
Metamycelium, such as multiple case studies, longitudinal studies, and quantitative
assessments of its performance and scalability in real-world settings.
Contextualising the findings from the case mechanism experiment and expert opin-
ions within the current state of practice and academia reveals the relevance and potential
contributions of Metamycelium to the field of BD architectures. While the architec-
ture aligns with emerging trends and addresses identified challenges, it also faces
adoption barriers and requires further empirical validation. Addressing these aspects
through ongoing research and real-world implementations will be crucial in establishing
Metamycelium’s position as a viable solution for organisations seeking to harness the
power of BD effectively.
The case mechanism experiment provides quantifiable results, while expert opinions
offer a qualitative perspective, adding depth to the evaluation. For example, the experi-
ments showed success rates, and the expert opinions highlighted the “cross-functional
nature and the ecosystem approach” as one of the architecture’s core strengths. Such
qualitative insights provide context that numbers alone might miss. However, it is
essential to critically examine the alignment between these two evaluation methods and
assess the validity of the conclusions drawn from their comparison.
The quantitative results from the case mechanism experiment demonstrate Metamycelium’s
ability to handle various scenarios related to data volume, velocity, variety, security, and
privacy. These findings suggest the architecture’s potential to address key challenges
in BD systems. However, it is crucial to acknowledge that these experiments were
conducted in a controlled environment, which may not fully reflect the complexities and
constraints of real-world implementations (Mikalef et al., 2018). Moreover, the case
mechanism experiment, while valuable, may not fully capture the nuances of real-world
implementations where factors like organisational culture, legacy systems, and data
governance practices could influence the architecture’s performance.
On the other hand, the qualitative insights from expert opinions provide a more
nuanced understanding of Metamycelium’s potential benefits and challenges, such as its
“domain-driven nature and data as first class citizen approach”, which are not directly
captured by the quantitative metrics of the case mechanism experiment. Nevertheless,
expert opinions are subject to individual biases and may not represent a comprehensive
view of the architecture’s applicability across different domains and industries (Creswell
et al., 2007). It is important to consider the potential for confirmation bias in expert
opinions, where experts may be predisposed to certain architectural styles.
Both methodologies identified challenges such as data dependencies and ownership
issues. Utilising a formal logical framework, let P represent the successful completion of the experimental scenarios and Q the positive feedback from experts. If P ∧ Q holds, as the findings suggest, the conjunction indicates a robust validation of the architecture’s strengths and its applicability in real-world scenarios.
Specifically:
• The high-volume data ingestion focus of scenario S1 aligns with expert feedback
on the architecture’s scalability and data handling efficiency (E_LargeScale). This
correlation (S1 → E_LargeScale) affirms the architecture’s capability in managing
large-scale data operations. However, this correlation should be interpreted with
caution, as the controlled environment of the experiment may not fully capture
the challenges and constraints of real-world large-scale data ingestion (Gorton
& Klein, 2015). Further investigation is needed to assess the scalability of
Metamycelium under varying real-world conditions, such as the heterogeneous
data requirements for training LLMs.
• Scenarios S4, S5, and S6, which centred on complex query processing and
data security, correlate with expert observations on the architecture’s strength
in data governance and innovative security management (E_PrivacySecurity). The
effective data integrity management in these scenarios supports the experts’
positive views, leading to (S4 ∧ S5 ∧ S6 → E_PrivacySecurity). Nevertheless, the
logical implication should be further validated through real-world implementations
and case studies to assess its robustness and generalisability (Nadal, Herrero,
Romero, Abelló et al., 2017). A comprehensive security assessment, including
penetration testing and vulnerability analysis, could provide additional evidence
for the architecture’s resilience against potential threats.
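Restating this shorthand in conventional notation, purely as a summary of the propositions already introduced above (no additional formal result is claimed):

\[
P \;\wedge\; Q, \qquad S_{1} \rightarrow E_{\text{LargeScale}}, \qquad (S_{4} \wedge S_{5} \wedge S_{6}) \rightarrow E_{\text{PrivacySecurity}}
\]

where P denotes the successful completion of the experimental scenarios, Q the positive expert feedback, and each implication links a scenario (or group of scenarios) to the corresponding expert observation.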
The comparative analysis of the case mechanism experiment and expert opinions
offers valuable insights into Metamycelium’s strengths and potential challenges. How-
ever, a critical examination of the alignment between the two evaluation methods
reveals the need for further validation and investigation to establish the robustness and
generalisability of the findings. By addressing the limitations of the current analysis
and conducting more comprehensive evaluations, future research can contribute to a
deeper understanding of Metamycelium’s potential as a domain-driven, decentralised
architecture for BD systems.
The growing use of large language models (LLMs) in data-intensive applications also demands careful consideration of data privacy, security, and the ethical implications of using powerful generative models, especially with regard to potential misuse or bias.
However, the effective deployment and utilisation of LLMs require scalable and
efficient BD architectures to handle the massive amounts of data involved in training
and inference processes. Metamycelium’s decentralised and domain-driven approach,
with its focus on modularity and scalability, could potentially address these challenges
and provide a suitable foundation for integrating LLMs into BD systems.
The domain-driven design of Metamycelium could also facilitate the development
of specialised LLMs tailored to specific domains or tasks, improving their accuracy and
efficiency (Y. Chang et al., 2024). However, the integration of LLMs into decentralised
BD architectures also poses challenges.
Issues such as data privacy, security, and the ethical implications of AI-driven
decision-making need to be carefully considered. Metamycelium’s emphasis on data
governance and security provides a foundation for addressing these concerns. However,
further research is needed to explore how decentralised architectures can effectively
manage the unique challenges and opportunities presented by LLMs and AI in general.
Metamycelium’s positioning within the existing body of knowledge on BD ar-
chitectures is further solidified by its alignment with emerging industry trends. The
architecture’s emphasis on decentralised data ownership and governance, as well as its
modular design, resonates with the growing interest in data mesh and microservices
architectures (Dehghani, 2022).
While Metamycelium is not explicitly a data mesh or microservices implementation,
it incorporates key principles from both paradigms, demonstrating the potential for cross-
fertilisation of ideas between academic research and industry practice. Future research
could explore the synergies between Metamycelium and these emerging architectures,
potentially leading to the development of more comprehensive and adaptable solutions
for BD management.
The generalisability and transferability of the findings from this research are subject
to certain considerations. While the case mechanism experiment provided empirical
evidence of Metamycelium’s capabilities within a controlled setting, its applicability in
diverse real-world contexts requires further investigation. Generalisability, as defined
by Hellström (2006), refers to the extent to which research findings can be applied to a
larger population or context beyond the sample studied.
In this case, while the experiment demonstrated the architecture’s effectiveness in
handling various data scenarios, the controlled nature of the experiment limits the extent
to which these findings can be generalised to the broader landscape of BD systems,
which may encompass different organisational structures, data governance practices,
and technological constraints.
Furthermore, the expert consultations, while providing valuable insights into Metamycelium’s
potential benefits and challenges, are inherently limited in their generalisability. The
opinions of the experts consulted, although insightful, represent a specific subset of per-
spectives and may not be universally applicable to all potential users of the architecture.
The transferability of the findings, defined as the extent to which they can be
applied to similar contexts or situations, also warrants further investigation. While
Metamycelium’s decentralised and domain-driven approach aligns with emerging in-
dustry trends like Data Mesh and Data Fabric, the specific challenges and opportunities
associated with implementing such an architecture may vary across different domains
and industries.
To enhance the generalisability and transferability of the research findings, future
studies could focus on replicating the case mechanism experiment in diverse real-world
settings, involving different organisations and data ecosystems. Additionally, a broader
range of expert opinions could be sought, encompassing diverse perspectives from
various industries and domains. Furthermore, the Technology Acceptance Model (TAM)
(Davis, 1985) could provide a valuable framework for assessing the transferability
of Metamycelium and understanding the factors influencing its adoption in diverse
organisational settings. TAM posits that perceived usefulness and perceived ease of use
are critical determinants of technology acceptance.
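As an illustration of how TAM is typically operationalised in adoption studies (a standard formulation from the TAM literature rather than a model estimated in this research), the behavioural intention to adopt an architecture such as Metamycelium can be expressed as:

\[
BI = \beta_{1}\,PU + \beta_{2}\,PEOU + \varepsilon
\]

where BI is the behavioural intention to use, PU the perceived usefulness, PEOU the perceived ease of use, β1 and β2 are weights usually estimated from practitioner survey data, and ε captures unexplained variance.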
While this study provides insights into Metamycelium’s potential as a decentralised
BD architecture, further research is needed to establish its generalisability and transfer-
ability across diverse real-world scenarios. By addressing the limitations of the current
study and incorporating a wider range of perspectives and contexts, future research can
contribute to a more comprehensive understanding of Metamycelium’s applicability
and impact in the broader landscape of BD systems.
The code used in the case mechanism experiment has been made publicly available.
Public access to the code allows for independent verification and reproducibility of
the results. However, the successful replication of findings and adaptation of the
Metamycelium architecture across diverse contexts may be influenced by factors beyond
code accessibility alone. While public availability contributes to transparency and
reproducibility, it does not inherently guarantee generalisability or transferability of the
research.
Various iterations of the artefact, and the design theories that underpin it,
underwent several rigorous peer-review processes, with submissions to multiple top-
tier academic venues, including esteemed journals and conferences like IEEE Access
(“IEEE Access”, 2024) and the Americas Conference on Information Systems (“Americas Conference on Information Systems, AMCIS”, 2024).
This section delves into the research process of BD RAs, highlighting the technical
complexities and insights encountered. It emphasises the importance of effective
communication within academic and industry contexts and the challenges of developing
a technically advanced artefact.
The initial iterations of the artefact highlighted a potential disconnect between the
study’s intended objectives and the interpretations drawn by some audiences. This
discrepancy underscores the challenges inherent in communicating complex technical
research to a diverse readership with varying levels of expertise and familiarity with
emerging concepts. The technical depth of the research, while essential for rigour and
validity, may have inadvertently obscured the practical implications of the proposed
architecture for some audiences.
This disconnect also highlights the need to bridge the gap between theoretical
advancements in BD RAs and their real-world applications, especially in comparison
to rapidly evolving fields like AI. Effectively communicating the potential impact
of Metamycelium in practical settings may require a more explicit articulation of its
potential benefits and challenges, particularly for stakeholders less familiar with the
technical nuances of BD architectures.
In professional circles, the concept of RAs for BD systems, while recognised, often
requires a more detailed explanation. This need for clarification suggests a potential
disconnect between the theoretical understanding of RAs within academia and their
practical application in industry. While the academic community is generally familiar
with the concept and benefits of RAs, industry professionals may have varying levels of
understanding and adoption of such architectures in their BD practices.
The expert consultations conducted as part of this research (Chapter 8) revealed a
range of perspectives on RAs. Some experts acknowledged the potential value of RAs
in providing a structured approach to BD system design and implementation, while
others expressed concerns about their complexity and potential lack of flexibility in
addressing unique organisational requirements. Additionally, the need for specialised
expertise and resources to implement and maintain RAs was identified as a potential
barrier to adoption in some industrial settings.
The observed need for clarification on the concept of RAs in industry highlights
an opportunity for greater collaboration between academia and industry to bridge
the gap between theoretical knowledge and practical implementation.
9.7 Conclusion
The in-depth study of BD RAs, particularly the Metamycelium architecture, has un-
derscored its potential to redefine domain-driven distributed systems. The analysis,
bridging empirical case findings with expert perspectives, establishes the Metamycelium architecture as a credible step in that direction.
Chapter 10
Conclusion
Figure: Chapter roadmap showing that Section 10.3 (The Big Data Landscape) links to Chapter 4 (Big Data Reference Architectures) and Section 10.4 (Metamycelium: A Solution in Context) links to Chapter 6 (Design).
10.1 Introduction
Following the detailed discussion in the previous chapter, this chapter concludes the
thesis by encapsulating the key aspects and outcomes of the research. It revisits the
initial problem statement and research questions, outlining how the study addressed
these through its methodologies and findings. Here, the focus is on synthesising the
entire journey, from inception to conclusion, and reflecting on the implications of the
research within the field of BD RAs. The chapter also acknowledges the limitations
encountered during the study and proposes avenues for future research, aiming to further
the discourse in the ever-evolving domain of BD systems.
The conclusion chapter commences with an introduction that sets the stage for
the synthesis of the research journey and its implications within the field of BD RAs.
Section 10.2 recapitulates the foundational aspects of BD that motivated this study,
highlighting the importance and challenges of BD, and restating the research questions
aimed at addressing BD project failures. The current landscape of BD is summarised
in Section 10.3, emphasising the gaps and challenges identified through the SLRs.
Metamycelium, as a solution to these challenges, is discussed in the context of ex-
isting architectures in Section 10.4, highlighting its novel aspects and potential to
address the limitations of current BD RAs. Section 10.5 illustrates the unique contribu-
tions of this study to the field of research. Section 10.6 acknowledges the limitations
encountered during the study, encompassing methodological constraints of literature re-
views, research methodology limitations, and the evolutionary journey of RAs. Finally,
Section 10.7 proposes avenues for future research, exploring areas such as empirical
validation, socio-technical considerations, integration with emerging technologies, and
the development of implementation guidelines and best practices, aiming to further the
discourse in the ever-evolving domain of BD systems.
10.2 Recapitulation
This section revisits the foundational aspects of BD that motivated this study, incorpo-
rating relevant literature to highlight the challenges in BD project implementation and
the context for the research questions addressed.
The advent of digital technologies has catalysed an unprecedented surge in data gener-
ation, marking the onset of the BD era. This development underscores the necessity
of advanced data management systems adept at handling the complexities and sheer
volume of contemporary data.
The importance of BD lies in its potential to reveal valuable insights and patterns
that can drive business decisions and innovation (Lycett, 2013), enable organisations to
better understand and serve their customers (Bughin, 2016), and optimise operations
and improve efficiency across various industries (Manyika et al., 2011).
Despite the potential that BD holds, its implementation is fraught with challenges.
Data architectures frequently struggle to handle the volume, velocity, and
variety of BD (H. Jagadish et al., 2014). The rapid pace of technological change makes
it difficult for organisations to keep up with the latest tools and techniques (H.-M. Chen,
Schütz, Kazman & Matthes, 2017). Moreover, organisational and cultural barriers to
becoming data-driven, such as silos and resistance to change, pose significant hurdles
(Mikalef et al., 2018).
These challenges contribute to the high failure rate of BD projects. Studies such
as those by MIT Technology Review Insights in partnership with Databricks (2021) and
NewVantage (2021) underscore the prevalence of challenges in achieving successful
BD outcomes, with failure rates ranging from 60% to 85%.
The SLR on BD RAs (Chapter 4) identifies several key failure points, including
the reliance on monolithic data pipeline architectures, which can lead to scalability
issues, data quality concerns, and difficulties in adapting to rapid technological changes
(Gorton & Klein, 2015; Nadal, Herrero, Romero, Abelló et al., 2017). Additionally,
the review highlights the lack of attention to cross-cutting concerns, data silos, lack of
domain ownership, and difficulties in maintaining data lineage.
These failure points and architectural challenges highlight the need for effective BD
management strategies and architectures that can address the technical and organisa-
tional challenges of implementing BD systems. The research questions posed in this
thesis (Section 2.3) aim to address these challenges by proposing a novel, domain-driven,
and decentralised RA for BD systems.
In response, newer approaches such as Data Mesh, Data Fabric, and DataOS have emerged, aiming to address scalability, agility, and governance challenges. Data
Mesh promotes decentralised data ownership by individual domains, enhancing data
quality and agility. Data Fabric provides unified access and management of disparate
data sources, simplifying integration and sharing. DataOS offers operating principles
for managing data as a strategic asset, improving quality, accessibility, and governance.
These newer approaches align with the principles of Metamycelium for more scalable
and agile architectures in the BD landscape.
This study emphasises the need for an intensified focus on RAs within the BD
sphere from both academic and industrial perspectives. The identification of challenges
and limitations in current BD RAs underscores the importance of further research in
this area to develop more robust and scalable BD systems.
Metamycelium organises the system as a decentralised network of data nodes, each addressing specific analytical or operational tasks. This
setup not only helps maintain data integrity by keeping it close to its source but also
allows for efficient data exchanges and updates across the network. The result is a
system that is more adaptable and resilient to errors.
Metamycelium’s approach addresses common bottlenecks found in traditional data
architectures, leading to a more dynamic management of data. It provides timely and
relevant data access to various stakeholders, from data engineers to software developers,
aligning with the needs of contemporary data-driven decision-making processes.
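To make the preceding description more concrete, the following minimal Python sketch illustrates the idea of domain-owned data nodes exchanging summarised updates over a shared network. All names (DataNode, Network, publish_update) are hypothetical illustrations and are not taken from the Metamycelium implementation; a real deployment would rely on an event backbone such as a message broker rather than in-process calls.

from dataclasses import dataclass, field
from typing import Callable, Dict, List

@dataclass
class DataNode:
    # A domain-owned node that keeps its data close to the source and
    # shares only summarised updates with the rest of the network.
    domain: str
    records: List[dict] = field(default_factory=list)

    def ingest(self, record: dict) -> None:
        self.records.append(record)

    def publish_update(self, network: "Network") -> None:
        # Raw records stay within the domain; only a summary is broadcast.
        network.broadcast(self.domain, {"record_count": len(self.records)})

class Network:
    # Routes updates between registered domains, avoiding a central data store.
    def __init__(self) -> None:
        self.subscribers: Dict[str, List[Callable[[str, dict], None]]] = {}

    def subscribe(self, domain: str, handler: Callable[[str, dict], None]) -> None:
        self.subscribers.setdefault(domain, []).append(handler)

    def broadcast(self, domain: str, summary: dict) -> None:
        for handler in self.subscribers.get(domain, []):
            handler(domain, summary)

if __name__ == "__main__":
    network = Network()
    sales = DataNode("sales")
    network.subscribe("sales", lambda d, s: print(f"update from {d}: {s}"))
    sales.ingest({"order_id": 1, "amount": 120.0})
    sales.publish_update(network)  # prints: update from sales: {'record_count': 1}

In this toy example, each domain retains ownership of its raw records, mirroring the data-close-to-source property described above, while other parts of the system still receive timely updates.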
Taking all of this into consideration, Metamycelium offers an approach to BD management
that seeks to improve upon traditional data management strategies. Its focus on decen-
tralisation and domain-driven design presents a new pathway in the ongoing evolution
of BD systems.
This study makes several unique contributions to the field of BD RAs and the broader
domain of software engineering for BD systems:
By documenting the evaluation process and the challenges encountered, this study provides valuable insights for researchers looking to evaluate RAs in the context of DSR.
10.6 Limitations
This section acknowledges and addresses the limitations encountered throughout the
research process, encompassing methodological constraints, and research methodology
limitations, as well as the limitations identified from the discussion and literature review
chapters of the thesis.
The case mechanism experiment and expert opinions may have been based on
a relatively small sample size, which could limit the generalisability of the findings.
Moreover, the adoption of the Galster and Avgeriou (2011b) empirically grounded RA
methodology, while appropriate for the study’s objectives, may have inherent limitations
in terms of empirical validation and real-world testing at scale.
Lastly, the study acknowledges resource constraints, such as limited access to
specialised tools, datasets, and computing resources, which could have impacted the
depth and breadth of the evaluation and testing phases. Additionally, the reliance on
custom-developed components (e.g., Data Lichen) due to resource constraints might
raise questions about the sustainability and scalability of such solutions in real-world
implementations.
Moreover, the exploration of privacy and security in BD RAs remains a critical area.
With the increasing focus on data privacy regulations like GDPR, developing secure
components for data scrubbing and enhancing pipeline security is essential.
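As a simple illustration of the kind of data-scrubbing component envisaged here, the following Python sketch masks a few personally identifiable fields before records enter a pipeline. The field names and masking rules are hypothetical examples only; a production component would need to be designed against the specific obligations of regulations such as GDPR.

import re

EMAIL_PATTERN = re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+")

def scrub_record(record: dict, pii_fields: frozenset = frozenset({"name", "phone"})) -> dict:
    # Return a copy of the record with PII fields redacted and e-mail-like
    # strings masked in the remaining free-text values.
    clean = {}
    for key, value in record.items():
        if key in pii_fields:
            clean[key] = "<redacted>"
        elif isinstance(value, str):
            clean[key] = EMAIL_PATTERN.sub("<email>", value)
        else:
            clean[key] = value
    return clean

# Example:
# scrub_record({"name": "Ada", "note": "contact ada@example.com"})
# returns {"name": "<redacted>", "note": "contact <email>"}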
Finally, future BD RA research should strive to address the biases and limitations in
current architectures influenced by specific technological offerings. Embracing a more
inclusive approach and exploring a wide range of technological solutions could avoid
limiting architectural choices and promote a more versatile and adaptive BD ecosystem.
In conclusion, future research in BD RAs should explore decentralised, domain-
driven systems, metadata management, data communication standards, and privacy
and security issues. This holistic approach is essential for advancing BD management
strategies, offering balanced and multifaceted solutions to the complex challenges in
the field.
10.8 Conclusion
The research focused on the Metamycelium architecture highlights a shift in the domain
of BD systems towards decentralised architectures. These architectures address the
complexities of the contemporary data landscape, making them increasingly relevant.
The novel design of the Metamycelium framework is well-suited to meet these evolving
data management needs.
The methodological rigour of this study, which included literature reviews and
feedback from experts, enhanced the validity of its findings. This approach ensured
a detailed and objective evaluation of the Metamycelium framework, affirming its
applicability.
Nevertheless, it is essential to recognise the inherent limitations and potential biases
within the research methodology. Acknowledging these aspects is crucial for a balanced
understanding of the study’s contributions and for pinpointing areas that necessitate
further research.
Reflecting on the research process, several key insights emerge. The BD systems
landscape is in a state of constant evolution, necessitating continuous innovation and
refinement. While the Metamycelium architecture marks a step forward in this land-
scape, ongoing exploration and adaptation are imperative. Empirical validation of the
framework in diverse real-world contexts is essential to confirm its effectiveness and
identify areas for improvement. Future research efforts, benefiting from collaborative
approaches, will likely yield richer perspectives and more comprehensive outcomes.
Looking ahead, the BD systems field presents a range of opportunities and chal-
lenges. Insights from this research can guide future studies, enabling them to navigate
these challenges effectively and contribute to the development of more dynamic, effi-
cient, and resilient BD systems.
References
2022 gartner® magic quadrant™ for analytics and business intelligence platforms
(Tech. Rep.). (2022). Gartner. Retrieved from https://www.tableau
.com/asset/gartner-magic-quadrant-2022 (Accessed: 2023-06-
03)
Aboulafia, M. (1991). Philosophy, social theory, and the thought of george herbert
mead. SUNY Press.
Abran, A., Moore, J. W., Bourque, P., Dupuis, R. & Tripp, L. (2004). Software
engineering body of knowledge. IEEE Computer Society, Angela Burgess, 25.
Ainsworth, S. & Jones, T. M. (2020). Prefetching in functional languages. Proceedings
of the 2020 ACM SIGPLAN International Symposium on Memory Management.
doi: 10.1145/3381898.3397209
Akhtar, P., Frynas, J. G., Mellahi, K. & Ullah, S. (2019). Big data-savvy teams’ skills,
big data-driven actions and business performance. British Journal of Management,
30(2), 252–271.
Aksakalli, I. K., Çelik, T., Can, A. B. & Tekinerdogan, B. (2021). Deployment and
communication patterns in microservice architectures: A systematic literature
review. Journal of Systems and Software, 180, 111014.
Al-Jaroodi, J. & Mohamed, N. (2016). Characteristics and requirements of big data
analytics applications. In 2016 ieee 2nd international conference on collaboration
and internet computing (cic) (pp. 426–432).
Almaatouq, A., Shmueli, E., Nouh, M., Alabdulkareem, A., Singh, V. K., Alsaleh, M.,
. . . Alfaris, A. (2016). If it looks like a spammer and behaves like a spammer,
it must be a spammer: analysis and detection of microblogging spam accounts
[Journal Article]. International Journal of Information Security, 15(5), 475-491.
doi: 10.1007/s10207-016-0321-5
Amatriain, X. (2013). Beyond data: from user information to business value through
personalized recommendations and consumer science [Conference Proceedings].
In (p. 2201-2208). ACM. doi: 10.1145/2505515.2514691
Amazon Web Services. (2024). Reference architecture 2 - analytics lens.
https://docs.aws.amazon.com/wellarchitected/latest/
analytics-lens/reference-architecture-2.html. (Accessed:
2024-03-23)
Amazon Web Services, I. (2023). Big data analytics options on aws (White
Paper). Author Retrieved from https://docs.aws.amazon.com/
pdfs/whitepapers/latest/big-data-analytics-options/
big-data-analytics-options.pdf#welcome
Americas conference on information systems, amcis. (2024). Retrieved from https://
aisnet.org/page/AMCIS
Angelov, S. & Grefen, P. (2008a). An e-contracting reference architecture. Journal of
Systems and Software, 81(11), 1816–1844.
Angelov, S. & Grefen, P. (2008b). An e-contracting reference architecture [Journal
Article]. Journal of Systems and Software, 81(11), 1816-1844. doi: 10.1016/
j.jss.2008.02.023
Angelov, S., Grefen, P. & Greefhorst, D. (2009). A classification of software reference
architectures: Analyzing their success and effectiveness. In 2009 joint working
ieee/ifip conference on software architecture & european conference on software
architecture (pp. 141–150).
Angelov, S., Grefen, P. & Greefhorst, D. (2012). A framework for analysis and design
of software reference architectures. Information and Software Technology, 54(4),
417–431.
Angelov, S., Trienekens, J. J. & Grefen, P. (2008). Towards a method for the evaluation
of reference architectures: Experiences from a case. In European conference on
software architecture (pp. 225–240).
ANSI, A. (1975). X3/sparc study group on dbms, interim report [Journal Article].
SIGMOD FDT Bull, 7(2).
Apache. (2023a). Apache airflow: Platform to programmatically author, schedule, and
monitor data pipelines. Retrieved from https://airflow.apache.org/
Apache. (2023b). Apache projects directory. Retrieved from https://projects
.apache.org/ (Accessed: 2023-06-03)
Apache. (2023c). Hadoop: Open-source software for reliable, scalable, distributed
computing. Retrieved from https://hadoop.apache.org/
Apollo GraphQL. (2023). Apollo federation. https://www.apollographql
.com/docs/federation/. (Accessed: May 10, 2023)
Arnautov, S., Trach, B., Gregor, F., Knauth, T., Martin, A., Priebe, C., . . . others (2016).
Scone: Secure linux containers with intel sgx. In Osdi (Vol. 16, pp. 689–703).
Asur, S. & Huberman, B. A. (2010). Predicting the future with social media. In 2010
ieee/wic/acm international conference on web intelligence and intelligent agent
technology (Vol. 1, pp. 492–499). doi: 10.1109/wi-iat.2010.63
Ataei, P. & Litchfield, A. (2020). Big data reference architectures: A systematic
literature review. In 2020 31st australasian conference on information systems
(acis) (pp. 1–11). doi: 10.5130/acis2020.bf
Ataei, P. & Litchfield, A. (2021a). Neomycelia: A software reference architecturefor
big data systems. In 2021 28th asia-pacific software engineering conference
(apsec) (pp. 452–462).
Ataei, P. & Litchfield, A. (2021b, dec). Neomycelia: A software reference architecture-
for big data systems. In 2021 28th asia-pacific software engineering conference
(apsec) (p. 452-462). Los Alamitos, CA, USA: IEEE Computer Society. Re-
trieved from https://doi.ieeecomputersociety.org/10.1109/
Bucchiarone, A., Dragoni, N., Dustdar, S., Lago, P., Mazzara, M., Rivera, V. &
Sadovykh, A. (2020). Microservices. Science and Engineering. Springer.
Bughin, J. (2016). Big data, big bang? [Journal Article]. Journal of Big Data, 3(1), 2.
doi: 10.1186/s40537-015-0014-3
Building a high-performance data and ai organization (Tech. Rep.). (2023). MIT Tech-
nology Review Insights. Retrieved from https://www.databricks.com/
resources/whitepaper/mit-technology-review-insights
-report (Accessed: 2023-06-03)
Buschmann, F., Meunier, R., Rohnert, H., Sommerlad, P. & Stal, M. (2008). Pattern-
oriented software architecture: A system of patterns, volume 1 (Vol. 1). John
wiley & sons.
Cackett, D. (2013). Information management and big data, a reference
architecture. Oracle: Redwood City, CA, USA. Retrieved from
https://www.oracle.com/technetwork/topics/entarch/
articles/info-mgmt-big-data-ref-arch-1902853.pdf
Capterra. (2023). Latest software categories. Retrieved from https://www
.capterra.com/categories (Accessed: 2023-06-03)
Carroll, J. M. (1995). Scenario-based design: Envisioning work and technology in
system development. Wiley.
Chainey, S., Tompson, L. & Uhlig, S. (2008). The utility of hotspot mapping for
predicting spatial patterns of crime [Journal Article]. Security journal, 21(1-2),
4-28. doi: 10.1057/palgrave.sj.8350066
Chang, W. L. & Boyd, D. (2018). Nist big data interoperability framework: Volume
6, big data reference architecture (Technical Report). Gaithersburg, MD, USA:
National Institute of Standards and Technology (NIST).
Chang, W. L., Grady, N. et al. (2015). Nist big data interoperability framework: volume
1, big data definitions (Tech. Rep.). Gaithersburg, MD, USA: National Institute
of Standards and Technology.
Chang, Y., Wang, X., Wang, J., Wu, Y., Yang, L., Zhu, K., . . . others (2024). A survey
on evaluation of large language models. ACM Transactions on Intelligent Systems
and Technology, 15(3), 1–45.
Chen, H., Chiang, R. H. & Storey, V. C. (2012). Business intelligence and analytics:
From big data to big impact [Journal Article]. MIS quarterly, 36(4), 1165. doi:
10.2307/41703503
Chen, H.-M., Kazman, R., Garbajosa, J. & Gonzalez, E. (2017). Big data value engineer-
ing for business model innovation. HAWAII INTERNATIONAL CONFERENCE
ON SYSTEM SCIENCES.
Chen, H.-M., Kazman, R. & Haziyev, S. (2016a). Agile big data analytics development:
An architecture-centric approach [Conference Proceedings]. In 2016 49th hawaii
international conference on system sciences (hicss) (p. 5378-5387). IEEE. doi:
10.1109/hicss.2016.665
Chen, H.-M., Kazman, R. & Haziyev, S. (2016b). Agile big data analytics for web-based
systems: An architecture-centric approach [Journal Article]. IEEE Transactions
on Big Data, 2(3), 234-248. doi: 10.1109/tbdata.2016.2564982
Chen, H.-M., Kazman, R., Haziyev, S. & Hrytsay, O. (2015). Big data system devel-
opment: An embedded case study with a global outsourcing firm [Conference
Proceedings]. In Proceedings of the first international workshop on big data
software engineering (p. 44-50). IEEE Press. doi: 10.1109/bigdse.2015.15
Chen, H.-M., Kazman, R. & Matthes, F. (2016). Demystifying big data adoption: Be-
yond it fashion and relative advantage [Conference Proceedings]. In Proceedings
of pre-icis (international conference on information system) digit workshop. doi:
10.1109/hicss.2016.631
Chen, H.-M., Schütz, R., Kazman, R. & Matthes, F. (2017). How lufthansa capitalized
on big data for business model renovation [Journal Article]. MIS Quarterly
Executive, 16(1). doi: 10.24251/hicss.2017.713
Chen, L. & Babar, M. A. (2011). A systematic review of evaluation of variability
management approaches in software product lines. Information and Software
Technology, 53(4), 344–362.
Chen, M., Yang, J., Zhou, J., Hao, Y., Zhang, J. & Youn, C.-H. (2018). 5g-smart
diabetes: Toward personalized diabetes diagnosis with healthcare big data clouds
[Journal Article]. IEEE Communications Magazine, 56(4), 16-23. doi: 10.1109/
mcom.2018.1700788
Chen, Y.-S. (2018). E-business and big data strategy in franchising [Book Section].
In Encyclopedia of information science and technology, fourth edition (p. 2686-
2696). IGI Global. doi: 10.4018/978-1-5225-2255-3.ch234
Cherryholmes, C. H. (1992). Notes on pragmatism and scientific realism. Educational
researcher, 21(6), 13–17.
Chua, W. F. (1986). Radical developments in accounting thought. Accounting review,
601–632.
Clandinin, D. J. & Connelly, F. M. (2004). Narrative inquiry: Experience and story in
qualitative research. John Wiley & Sons.
Clements, P., Garlan, D., Little, R., Nord, R. & Stafford, J. (2003). Documenting
software architectures: views and beyond. In 25th international conference on
software engineering, 2003. proceedings. (pp. 740–741).
Cloutier, R., Muller, G., Verma, D., Nilchiani, R., Hole, E. & Bone, M. (2010a). The
concept of reference architectures [Journal Article]. Systems Engineering, 13(1),
14-27.
Cloutier, R., Muller, G., Verma, D., Nilchiani, R., Hole, E. & Bone, M. (2010b). The
concept of reference architectures [Journal Article]. Systems Engineering, 13(1),
14-27. doi: 10.2514/6.2017-5118
Coblenz, M., Sunshine, J., Aldrich, J., Myers, B. A., Weber, S. & Shull, F. (2016). Ex-
ploring language support for immutability. Proceedings of the 38th International
Conference on Software Engineering. doi: 10.1145/2884781.2884798
Cohen, J. (1960). A coefficient of agreement for nominal scales [Journal Article].
Educational and Psychological Measurement, 20(1), 37–46. doi: 10.1177/
001316446002000104
Collaborative, N. (2023). U.s. climate resilience toolkit: Climate explorer. (Available
from: https://crt-climate-explorer.nemac.org/)
Gan, Y., Zhang, Y., Cheng, D., Shetty, A., Rathi, P., Katarki, N., . . . others (2019).
An open-source benchmark suite for microservices and their hardware-software
implications for cloud & edge systems. In Proceedings of the twenty-fourth
international conference on architectural support for programming languages
and operating systems (pp. 3–18).
Gartner. (2021). Magic quadrant. https://www.gartner.com/en/
research/methodologies/research-methodologies-and
-processes/magic-quadrants-research. (Accessed: March 26,
2023)
Geerdink, B. (2013). A reference architecture for big data solutions introducing a model
to perform predictive analytics using big data technology. In 8th international
conference for internet technology and secured transactions (icitst-2013) (pp.
71–76).
GitHub. (2023). Explore github. Retrieved from https://github.com/explore
(Accessed: 2023-06-03)
GitOps. (2023). https://www.gitops.tech/. (Accessed: April 19, 2023)
Gohil, A., Modi, H. & Patel, S. K. (2013). 5g technology of mobile communication: A
survey [Conference Proceedings]. In 2013 international conference on intelligent
systems and signal processing (issp) (p. 288-292). IEEE. doi: 10.1109/issp.2013
.6526920
Goldman, O. (2024). Effective software architecture: Building better software faster
(1st ed.). United States: Addison-Wesley Professional.
Gorelik, A. (2019). The enterprise big data lake: Delivering the promise of big data
and data science. O’Reilly Media.
Gorton, I. & Klein, J. (2015). Distribution, data, deployment [Journal Article]. STC
2015, 78.
Graaf, B., Van Dijk, H. & Van Deursen, A. (2005). Evaluating an embedded software
reference architecture-industrial experience report. In Ninth european conference
on software maintenance and reengineering (pp. 354–363).
Greefhorst, D. (1999). Een applicatie-architectuur voor het web bij de bank—de pro’s
en contra’s van toestandsloosheid. Software Release Magazine, 2.
Greefhorst, D. & Gehner, P. (2006). Achmea streamlines application development and
integration. Via Nova Architectura.
Gregg, B. (2014). Systems performance: enterprise and the cloud. Pearson Education.
Guo, L. & Vargo, C. (2015). The power of message networks: A big-data analysis of
the network agenda setting model and issue ownership [Journal Article]. Mass
Communication and Society, 18(5), 557-576. doi: 10.1080/15205436.2015
.1045300
Haller, P. & Axelsson, L. (2017). Quantifying and explaining immutability in scala.
Electronic Proceedings in Theoretical Computer Science, 246, 21-27. doi: 10
.4204/eptcs.246.5
Han, J., Haihong, E., Le, G. & Du, J. (2011). Survey on nosql database [Conference
Proceedings]. In 2011 6th international conference on pervasive computing and
applications (p. 363-366). IEEE. doi: 10.1109/icpca.2011.6106531
Iso, I. (2011). Iec25010: 2011 systems and software engineering–systems and software
quality requirements and evaluation (square)–system and software quality models
[Journal Article]. International Organization for Standardization, 34, 2910.
ISO, I. (2016). Information technology — reference architecture for service oriented
architecture (soa ra) — part 1: Terminology and concepts for soa. International
Organization for Standardization, 51. Retrieved from https://www.iso
.org/standard/63104.html
ISO/IEC. (2018). Iso/iec 29148:2018. systems and software engineering — life cycle
processes — requirements engineering [Standard]. Retrieved from https://
www.iso.org/standard/72089.html
Iso/iec 25000:2005. software engineering — software product quality requirements and
evaluation (square) — guide to square [Standard]. (2014).
ISO/IEC 26550: 2015-software and systems engineering–reference model for product
line engineering and management [Standard]. (2015).
Istio. (2018). Istio: An open platform to connect, manage, and secure microservices.
https://istio.io/.
Jagadish, H., Gehrke, J., Labrinidis, A., Papakonstantinou, Y., Patel, J. M., Ramakr-
ishnan, R. & Shahabi, C. (2014). Big data and its technical challenges [Journal
Article]. Communications of the ACM, 57(7), 86-94. doi: 10.1145/2611567
Jagadish, H. V., Gehrke, J., Labrinidis, A., Papakonstantinou, Y., Patel, J. M., Ra-
makrishnan, R. & Shahabi, C. (2014). Big data and its technical challenges.
Communications of the ACM, 57(7), 86–94.
James, W. (1981). Pragmatism: A new name for some old ways of thinking (b. kuklick,
ed.). Indianapolis, IN: Hackett.(Original work published 1907).
Jin, S., Lin, W., Yin, H., Yang, S., Li, A. & Deng, B. (2015). Community structure
mining in big data social media networks with mapreduce [Journal Article].
Cluster computing, 18(3), 999-1010. doi: 10.1007/s10586-015-0452-x
Jin, X., Wah, B. W., Cheng, X. & Wang, Y. (2015). Significance and challenges of big
data research. Big data research, 2(2), 59–64.
Johnson, M. (2019). Principles of logic in computer science. TechPress.
Josey, A. (2016). Togaf® version 9.1-a pocket guide. Van Haren.
Josey, A., Lankhorst, M., Band, I., Jonkers, H. & Quartel, D. (2016). An introduction
to the archimate® 3.0 specification. White Paper from The Open Group.
JSON Schema. (2019). https://json-schema.org/. (Accessed: April 30,
2023)
Kaisler, S., Armour, F., Espinosa, J. A. & Money, W. (2013). Big data: Issues and
challenges moving forward [Conference Proceedings]. In 2013 46th hawaii
international conference on system sciences (p. 995-1004). IEEE. doi: 10.1109/
hicss.2013.645
Kakivaya, G., Xun, L., Hasha, R., Ahsan, S. B., Pfleiger, T., Sinha, R., . . . others (2018).
Service fabric: a distributed platform for building microservices in the cloud. In
Proceedings of the thirteenth eurosys conference (pp. 1–15).
Kalil, T. (2012). Big data is a big deal. Retrieved from https://
obamawhitehouse.archives.gov/blog/2012/03/29/
big-data-big-deal
Kallio, H., Pietilä, A.-M., Johnson, M. & Kangasniemi, M. (2016). Systematic
methodological review: developing a framework for a qualitative semi-structured
interview guide. Journal of advanced nursing, 72(12), 2954–2965.
Kasneci, E., Seßler, K., Küchemann, S., Bannert, M., Dementieva, D., Fischer, F., . . .
others (2023). Chatgpt for good? on opportunities and challenges of large lan-
guage models for education. Learning and individual differences, 103, 102274.
Kazman, R., Bass, L., Abowd, G. & Webb, M. (1994). Saam: A method for analyzing
the properties of software architectures. In Proceedings of 16th international
conference on software engineering (pp. 81–90).
Kazman, R., Klein, M., Barbacci, M., Longstaff, T., Lipson, H. & Carriere, J. (1998a).
The architecture tradeoff analysis method [Conference Proceedings]. In Proceed-
ings. fourth ieee international conference on engineering of complex computer
systems (cat. no. 98ex193) (p. 68-78). IEEE.
Kazman, R., Klein, M., Barbacci, M., Longstaff, T., Lipson, H. & Carriere, J. (1998b).
The architecture tradeoff analysis method [Conference Proceedings]. In Proceed-
ings. fourth ieee international conference on engineering of complex computer
systems (cat. no. 98ex193) (p. 68-78). IEEE. doi: 10.21236/ada350761
Khine, P. & Wang, Z. (2019). A review of polyglot persistence in the big data world.
Information, 10, 141. doi: 10.3390/info10040141
Khine, P. P. & Wang, Z. (2019). A review of polyglot persistence in the big data world.
Information, 10(4), 141.
Khrononov, S. (2021). Learning domain-driven design: Aligning your architecture with
the business using context maps, strategic design, and agile techniques. Birming-
ham, UK: Packt Publishing. Retrieved from https://www.amazon.com/
Learning-Domain-Driven-Design-Aligning-Architecture/
dp/1098100131
Kimball, R. & Ross, M. (2013). The data warehouse toolkit: The definitive guide to
dimensional modeling. Wiley.
Kiran, M., Murphy, P., Monga, I., Dugan, J. & Baveja, S. S. (2015a). Lambda
architecture for cost-effective batch and speed big data processing. In 2015 ieee
international conference on big data (big data) (pp. 2785–2792).
Kiran, M., Murphy, P., Monga, I., Dugan, J. & Baveja, S. S. (2015b). Lambda
architecture for cost-effective batch and speed big data processing [Conference
Proceedings]. In 2015 ieee international conference on big data (big data)
(p. 2785-2792). IEEE. doi: 10.1109/bigdata.2015.7364082
Kitchenham, B. A., Budgen, D. & Brereton, P. (2015). Evidence-based software
engineering and systematic reviews (Vol. 4). CRC press.
Kitchenham, B. A., Pfleeger, S. L., Pickard, L. M., Jones, P. W., Hoaglin, D. C.,
El Emam, K. & Rosenberg, J. (2002). Preliminary guidelines for empirical
research in software engineering. IEEE Transactions on software engineering,
28(8), 721–734.
Klein, J., Buglak, R., Blockow, D., Wuttke, T. & Cooper, B. (n.d.). A reference
architecture for big data systems in the national security domain [Conference
Proceedings]. In 2016 ieee/acm 2nd international workshop on big data software
engineering (bigdse) (p. 51-57). IEEE.
Klein, J., Buglak, R., Blockow, D., Wuttke, T. & Cooper, B. (2016). A reference
architecture for big data systems in the national security domain. In 2016 ieee/acm
2nd international workshop on big data software engineering (bigdse) (pp. 51–
57).
Kleppmann, M. (2017). Designing data-intensive applications: The big ideas behind
reliable, scalable, and maintainable systems [Book]. O’Reilly Media, Inc.
doi: 10.1007/978-3-319-77525-8_197
Kohler, J. & Specht, T. (2019). Towards a secure, distributed, and reliable cloud-based
reference architecture for big data in smart cities. In Big data analytics for smart
and connected cities (pp. 38–70). IGI Global.
Kreps, J. (2014a). Questioning the lambda architecture. Blog post.
Retrieved from https://www.oreilly.com/radar/questioning
-the-lambda-architecture/
Kreps, J. (2014b). Questioning the lambda architecture. Online arti-
cle, July, 205. Retrieved from https://www.oreilly.com/radar/
questioning-the-lambda-architecture/
Kreps, J. (2023). Benchmarking apache kafka: 2 million writes per second
(on three cheap machines). https://www.confluent.io/blog/
benchmarking-apache-kafka-2-million-writes-per-second
-on-three-cheap-machines/. (Accessed: 2023-06-05)
Krippendorff, K. (1970). Estimating the reliability, systematic error and random error
of interval data [Journal Article]. Educational and Psychological Measurement,
30(1), 61–70. doi: 10.1177/001316447003000106
Krishnan, K. (2013). Data warehousing in the age of big data. Morgan Kaufmann.
Kubernetes Special Interest Group, S. (2023). kind - kubernetes in docker. Retrieved
from https://kind.sigs.k8s.io/
Laigner, R., Zhou, Y., Salles, M. A. V., Liu, Y. & Kalinowski, M. (2021a). Data
management in microservices: State of the practice, challenges, and research
directions. Proceedings of the VLDB Endowment, 14(13), 3348–3361. doi:
10.14778/3484224.3484232
Laigner, R., Zhou, Y., Salles, M. A. V., Liu, Y. & Kalinowski, M. (2021b). Data
management in microservices: State of the practice, challenges, and research
directions. arXiv preprint arXiv:2103.00170.
Lankhorst, M. (2013). A language for enterprise modelling. In Enterprise architecture
at work (pp. 75–114). Springer.
Lankhorst, M. M., Proper, H. A. & Jonkers, H. (2010). The anatomy of the archimate
language. International Journal of Information System Modeling and Design
(IJISMD), 1(1), 1–32.
Laplante, P. A. (2017). Requirements engineering for software and systems. Auerbach
Publications.
La Rosa, M., van der Aalst, W. M., Dumas, M. & Ter Hofstede, A. H. (2009).
Maier, M., Serebrenik, A. & Vanderfeesten, I. (2013). Towards a big data reference
architecture [Journal Article]. University of Eindhoven.
Manyika, J., Chui, M., Brown, B., Bughin, J., Dobbs, R., Roxburgh, C. & Byers, A. H.
(2011). Big data: The next frontier for innovation, competition, and productivity
[Journal Article]. , 3. doi: 10.1186/s40537-015-0014-3
March, S. T. & Smith, G. F. (1995). Design and natural science research on information
technology. Decision support systems, 15(4), 251–266.
Markus, M. L., Majchrzak, A. & Gasser, L. (2002a). A design theory for systems
that support emergent knowledge processes [Journal Article]. MIS quarterly, 43,
179-212. doi: 10.1016/j.dss.2006.09.005
Markus, M. L., Majchrzak, A. & Gasser, L. (2002b). A design theory for systems that
support emergent knowledge processes. MIS quarterly, 179–212.
Márquez, G. & Astudillo, H. (2018). Actual use of architectural patterns in
microservices-based open source projects. In 2018 25th asia-pacific software
engineering conference (apsec) (pp. 31–40).
Marquez, G. & Astudillo, H. (2018). Actual use of architectural patterns in
microservices-based open source projects. In 2018 25th asia-pacific software engi-
neering conference (apsec) (pp. 31–40). IEEE. doi: 10.1109/APSEC.2018.00017
Marr, B. (2016). Big data in practice: how 45 successful companies used big data
analytics to deliver extraordinary results [Book]. John Wiley and Sons. doi:
10.1109/bigdata.2018.8622333
Martinez-Prieto, M. A., Cuesta, C. E., Arias, M. & Fernández, J. D. (2015). The solid
architecture for real-time management of big semantic data. Future Generation
Computer Systems, 47, 62–79.
Marz, N. & Warren, J. (2015). Big data: Principles and best practices of scalable
real-time data systems [Book]. New York; Manning Publications Co. doi:
10.1109/tcss.2020.2995497
McAfee, A. & Brynjolfsson, E. (2012). Big data: the management revolution [Journal
Article]. Harv Bus Rev, 90(10), 60-6, 68, 128. Retrieved from https://
www.ncbi.nlm.nih.gov/pubmed/23074865
McKinsey, G. et al. (2011). Big data: The next frontier for innovation, competition, and
productivity. McKinsey Global Institute, 158-184. doi: 10.7591/9781501734328
-007
Meadows, D. H. (2008). Thinking in systems: A primer. Chelsea Green Publishing.
Mehta, N. & Pandit, A. (2018). Concurrence of big data analytics and healthcare: A
systematic review [Journal Article]. Int J Med Inform, 114, 57-65. Retrieved
from https://www.ncbi.nlm.nih.gov/pubmed/29673604 doi: 10
.1016/j.ijmedinf.2018.03.013
Merriam, S. B. & Grenier, R. S. (2019). Qualitative research in practice: Examples for
discussion and analysis. John Wiley & Sons.
Mertens, D. M. (2008). Transformative research and evaluation. Guilford press.
Mertens, D. M. (2019). Research and evaluation in education and psychology: In-
tegrating diversity with quantitative, qualitative, and mixed methods. Sage
publications.
Mikalef, P., Pappas, I. O., Krogstie, J. & Giannakos, M. (2018). Big data analytics
capabilities: a systematic literature review and research agenda [Journal Article].
Information Systems and e-Business Management, 16(3), 547-578. doi: 10.1109/
educon.2018.8363273
Miles, M. B. & Huberman, A. M. (1994). Qualitative data analysis: An expanded
sourcebook. sage.
Montesi, F. & Weber, J. (2016). Circuit breakers, discovery, and api gateways in
microservices. arXiv preprint arXiv:1609.05830.
Morris, K. (2016). Infrastructure as Code: Managing Servers in the Cloud. O’Reilly
Media, Inc. ([Print])
Moses, B., Gavish, L. & Vorwerck, M. (2022). Data quality fundamentals: A practi-
tioner’s guide to building trustworthy data pipelines (1st ed.). O’Reilly Media.
Moustakas, C. (1994). Phenomenological research methods. Sage publications.
Muller, G. (2008). A reference architecture primer. Eindhoven Univ. of Techn.,
Eindhoven, White paper.
Munn, Z., Barker, T. H., Moola, S., Tufanaru, C., Stern, C., McArthur, A., . . . Aro-
mataris, E. (2020). Methodological quality of case series studies: an introduction
to the jbi critical appraisal tool. JBI evidence synthesis, 18(10), 2127–2133.
Murdoch, T. B. & Detsky, A. S. (2013). The inevitable application of big data to health
care [Journal Article]. JAMA, 309(13), 1351-2. Retrieved from https://www
.ncbi.nlm.nih.gov/pubmed/23549579 doi: 10.1001/jama.2013.393
Myers, M. D. & Avison, D. (2002). Qualitative research in information systems: a
reader. Sage.
Nadal, S., Herrero, V., Romero, O., Abelló, A., Franch, X., Vansummeren, S. & Valerio,
D. (2017). A software reference architecture for semantic-aware big data systems.
Information and software technology, 90, 75–92.
Nadal, S., Herrero, V., Romero, O., Abelló, A., Franch, X., Vansummeren, S. & Valerio,
D. (2017). A software reference architecture for semantic-aware big data systems
[Journal Article]. Information and software technology, 90, 75-92.
Nakagawa, E. Y., Guessi, M., Maldonado, J. C., Feitosa, D. & Oquendo, F. (2014).
Consolidating a process for the design, representation, and evaluation of reference
architectures. In 2014 ieee/ifip conference on software architecture (pp. 143–
152).
Nakagawa, E. Y., Martins, R. M., Felizardo, K. R. & Maldonado, J. C. (2009). To-
wards a process to design aspect-oriented reference architectures [Conference
Proceedings]. In Xxxv latin american informatics conference (clei) 2009.
Nakagawa, E. Y., Oquendo, F. & Becker, M. (2012). Ramodel: A reference model for
reference architectures. In 2012 joint working ieee/ifip conference on software
architecture and european conference on software architecture (pp. 297–301).
Nakagawa, E. Y., Oquendo, F. & Maldonado, J. C. (2014). Reference architectures.
Software Architecture 1, 55–82.
Nash, H. (2015). Cio survey 2015 [Journal Article]. Association with KPMG.
Nasser, T. & Tariq, R. (2015). Big data challenges. J Comput Eng Inf Technol 4: 3. doi:
http://dx.doi.org/10.4172/2324, 9307(2).
Voigt, P. & Von dem Bussche, A. (2017). The eu general data protection regulation
(gdpr). A Practical Guide, 1st Ed., Cham: Springer International Publishing,
10(3152676), 10–5555.
Volk, M., Bosse, S., Bischoff, D. & Turowski, K. (2019). Decision-support for
selecting big data reference architectures. In International conference on business
information systems (pp. 3–17).
Volk, M., Staegemann, D., Trifonova, I., Bosse, S. & Turowski, K. (2020). Identifying
similarities of big data projects–a use case driven approach. IEEE Access, 8,
186599–186619.
Walls, J. G., Widmeyer, G. R. & El Sawy, O. A. (1992). Building an information system
design theory for vigilant eis. Information systems research, 3(1), 36–59.
Wamba, S. F., Gunasekaran, A., Akter, S., Ren, S. J.-f., Dubey, R. & Childe, S. J.
(2017). Big data analytics and firm performance: Effects of dynamic capabilities
[Journal Article]. Journal of Business Research, 70, 356-365. doi: 10.1016/
j.jbusres.2016.08.009
Wang, C. J., Ng, C. Y. & Brook, R. H. (2020). Response to covid-19 in taiwan: big data
analytics, new technology, and proactive testing. Jama, 323(14), 1341–1342.
Wang, H., Xu, Z., Fujita, H. & Liu, S. (2016). Towards felicitous decision making:
An overview on challenges and trends of big data [Journal Article]. Information
Sciences, 367, 747-765. doi: 10.1016/j.ins.2016.07.007
Waseem, M., Liang, P., Ahmad, A., Shahin, M., Khan, A. A. & Marquez, G. (2022).
Decision models for selecting patterns and strategies in microservices systems and
their evaluation by practitioners. In 2022 ieee/acm 44th international conference
on software engineering: Software engineering in practice (icse-seip) (pp. 135–
144). IEEE. doi: 10.1109/ICSE-SEIP55303.2022.9793911
Webster, J. & Watson, R. T. (2002). Analyzing the past to prepare for the future:
Writing a literature review. MIS quarterly, xiii–xxiii.
Weerasinghe, S. & Perera, I. (2022). Taxonomical classification and systematic review
on microservices. International Journal of Engineering Trends and Technology,
70(3), 222–233. doi: 10.14445/22315381/IJETT-V70I3P225
Weiser, M., Brown, J. S., Denning, P. & Metcalfe, R. (1998). Beyond calculation: The
next fifty years of computing. Copernicus.
Weyrich, M. & Ebert, C. (2015). Reference architectures for the internet of things.
IEEE Software, 33(1), 112–116.
Wieringa, R. J. (2014). Design science methodology for information systems and
software engineering. Springer.
Williams, L. G. & Smith, C. U. (n.d.). Pasasm: a method for the performance assessment
of software architectures [Conference Proceedings]. In Proceedings of the 3rd
international workshop on software and performance (p. 179-189).
Wirth, N. (2008). A brief history of software engineering [Journal Article]. IEEE
Annals of the History of Computing, 30(3), 32-39. doi: 10.1109/mahc.2008.33
Wolcott, H. (2008). Ethnography: A way of seeing. 2: nd ed. Walnut Creek, CA:
AltaMira.
Wolfram Research, Inc. (2021). Computational Notebook. https://www.wolfram
Appendix A
Expert Opinion Guide
A.1 Introduction
Thank you for your participation. Your opinion is being collected to validate theories regarding a domain-driven distributed RA called Metamycelium. There are no right or wrong answers; the interest is in your opinions and experiences. This process should take approximately one and a half hours, depending on the flow of the dialogue.
All your responses will be confidential, and the results of this expert opinion gathering will be presented without mentioning your name. You may decline to answer any question or stop the process at any time and for any reason. Do you have any questions about what I have just explained?
Note to the reader/researcher: Please note that this guide encompasses only the main themes discussed with the expert and, as such, does not include the prompts that may have emerged during the process. Some general prompts and close-ended questions are included.
Before we begin, it would be nice if you could introduce yourself and tell me a bit about
your background and your area of interest.
2. Could you please tell me how many years of professional experience you have in software engineering or data engineering? (10 years)
1. Could you please tell me how many years of experience you have related to data engineering or big data? (6 years)
2. Could you please elaborate on your experiences with big data systems (or any related systems)?
2. How does this architecture align with industry trends and future projections?
2. Are there features or components that differentiate this architecture from existing
solutions?
2. Are there scenarios where this architecture might not be the best fit?
1. Are there any further comments, suggestions, or improvements that you have for our study?
3. I foresee the Metamycelium architecture being even more relevant in the future
industry landscape.
4. The Metamycelium architecture has distinct features that set it apart from existing
solutions.
8. There might be scenarios where Metamycelium is not the most optimal architec-
ture choice.
10. I believe the Metamycelium architecture can bring substantial benefits to organi-
sations that adopt it.
Appendix B
Reference Architectures Classification
This framework divides RAs into two major classes: standardisation RAs and facilitation
RAs.
(a) Microsoft Application Architecture for .Net (Press, Joyner & Malcolm,
2002)