Stability and Availability Optimization of Distributed ERP Systems During
1 Introduction
Many customers currently running ERP solutions on-premise would prefer to
take advantage of the benefits of the cloud, such as avoiding hardware costs and
The original version of this chapter was revised: Acknowledgement has been added
below figures 1 and 2. The correction to this chapter is available at
https://doi.org/10.1007/978-3-031-28451-9_50
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2023, corrected publication 2023
L. Barolli (Ed.): AINA 2023, LNNS 654, pp. 343–354, 2023.
https://doi.org/10.1007/978-3-031-28451-9_30
344 G. C. Aloysius et al.
process simple, and cost-effective [7]. For example, AWS has a presence across
regions such as North America, South America, Europe, the Middle East/Africa, and
Asia Pacific, with several data centers within each region. Architecting an
enterprise solution for a global organization is therefore much simpler in the cloud.
Migrating an SAP landscape cannot be done in isolation. The integration with
external systems must be preserved, which requires adjustments on both the
migrated SAP system and the external system in question. In practice, this work
relies heavily on individuals, with no systematic approach to handling it. One of
the important tasks when planning an SAP migration is therefore to ensure that SAP
interfaces are documented. Some interfaces may contain hardcoded IP addresses that
need manual correction or, as a best practice, should be replaced by hostnames
resolved through DNS. External interfaces need to be tested, which requires
coordination with external partners and so needs to be carefully planned [8].
There have been numerous situations where critical external interfaces were not
captured as part of the migration. Raising firewall requests and implementing them
during a run is extremely cumbersome and directly affects availability, since
non-working functionality still counts as downtime from a customer standpoint.
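The interface check described above can be partly automated. The following is a minimal sketch, assuming the interface documentation can be exported as a mapping from interface name to configured target; the inventory entries, names, and hosts below are hypothetical and only illustrate how hardcoded IPv4 addresses could be flagged before the migration.

```python
import ipaddress
import re

# Candidate dotted-quad pattern; real validation is delegated to ipaddress.
IPV4_CANDIDATE = re.compile(r"(?<![\d.])(\d{1,3}(?:\.\d{1,3}){3})(?![\d.])")


def find_hardcoded_ips(interfaces):
    """Return {interface name: [IPv4 literals]} for targets that embed an IP.

    `interfaces` maps a (hypothetical) interface name to its configured
    target: a host, a URL, or a connection string.
    """
    findings = {}
    for name, target in interfaces.items():
        ips = []
        for candidate in IPV4_CANDIDATE.findall(target):
            try:
                ipaddress.IPv4Address(candidate)
            except ValueError:
                continue  # not a valid address, e.g. "999.1.1.1"
            ips.append(candidate)
        if ips:
            findings[name] = ips
    return findings


# Hypothetical inventory taken from the interface documentation.
inventory = {
    "RFC_TO_WMS": "10.20.30.40:3300",
    "SFTP_BANK": "sftp://payments.example.com:22/inbound",
    "HTTP_TAX": "https://192.168.5.17/tax/api",
}

for name, ips in sorted(find_hardcoded_ips(inventory).items()):
    print(f"{name}: replace {', '.join(ips)} with a DNS hostname")
```

A scan like this only covers interfaces that are documented in the first place, so it complements rather than replaces the interface inventory and the tests with external partners.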
After several years of running a solution on-premise, with substantial money
invested in tools and solutions, customers become tied to a solution without
questioning its efficiency against other available technology options. Moving to
the cloud reopens this conversation: instead of carrying over their old backup,
recovery, and monitoring solutions, it makes sense to work with the cloud vendor to
identify tools and solutions that give better performance and reliability [9]. In
addition, cloud vendors usually support multiple customers with a similar setup,
so they should have the performance metrics of these tools at hand [10].
This paper proposes a method to ensure stability, availability, and optimum
response in the migrated system for a typical “Lift and shift” scenario with or
without Unicode conversion. The proposal significantly improves the stability,
availability, and performance of the SAP ERP environment.
Fig. 1. High availability solution design using the AWS single-Region architectural
pattern. Adapted from “Pattern 1: A single Region with two AZs for production” in [12]
SAP HA Components
As shown in Fig. 2, the SAP production system is built across two separate data
centers (let us call them DC1 and DC2) in Paris (Region) as active/passive. The
compute deployed for the productive database, SAP Central Services, and application
servers is of the same type in both data centers. SAP Central Services (SAP
ASCS) is a single point of failure, so it is made redundant by building an active
central services instance in DC1 and a passive one in DC2. Central services also
hold the application locks through the Enqueue service; this is likewise a single
point of failure and is addressed by building an active Enqueue Replication
service in DC2. In the event of a failure, the cloud network load balancer
relocates the Enqueue service to the node running the active Enqueue Replication
service, which ensures that the locks are not lost in the process. Note that we
are not leveraging any OS-level cluster solution for auto-failover, but rather a
replication solution used in tandem with the cloud network load balancer to move
between nodes manually. The solution is deployed in a single-region,
multiple-data-center architecture with redundancy built within the region. If
auto-failover to DC2 is needed, a cluster solution can be added to the mix.
Specific services such as FTP and SFTP, as well as specific ports, can be set up
for redirection in the cloud network load balancer, which can then be used as an
additional application service load-balancing tool alongside SAP logon groups.
Load balancing between the application servers is managed using the standard SAP
logon group solution (SMLG). Redundant application servers are available in both
data centers and serve as a backup to each other. Database instances are
replicated through a synchronous replication setup between the databases in both
data centers: logs from the primary database are shipped across the network and
applied to the secondary database in real time, so if the primary database fails,
the secondary is ready to handle the load. This requires reliable network
availability between the two data centers.
Fig. 2. SAP HA components diagram. Adapted from “SAP ASCS High Availability
using ERS explained”, https://blogs.sap.com/2021/10/28/sap-ascs-high-availability-using-ers-explained/
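The reason replication keeps the application locks intact across a manual switch can be illustrated with a toy model. This is a conceptual sketch only and not SAP's enqueue protocol: the class names, the in-memory lock table, and the `failover` method are hypothetical, and the network load balancer is reduced to swapping which node is treated as active.

```python
class EnqueueNode:
    """Toy stand-in for a node that holds an in-memory lock table."""

    def __init__(self, name):
        self.name = name
        self.locks = {}  # lock key -> owner


class ReplicatedEnqueue:
    """The active node grants locks; each change is mirrored to the replica."""

    def __init__(self, active, replica):
        self.active = active
        self.replica = replica

    def acquire(self, key, owner):
        if key in self.active.locks:
            return False  # lock is already held
        self.replica.locks[key] = owner  # replicate first, then grant
        self.active.locks[key] = owner
        return True

    def release(self, key, owner):
        if self.active.locks.get(key) != owner:
            return False  # only the owner may release
        del self.replica.locks[key]
        del self.active.locks[key]
        return True

    def failover(self):
        """Manual switch: the replica takes over with its copy of the locks."""
        failed = self.active
        self.active = self.replica
        # A fresh replica is seeded from the surviving lock table.
        self.replica = EnqueueNode(failed.name + "-rebuilt")
        self.replica.locks = dict(self.active.locks)


enq = ReplicatedEnqueue(EnqueueNode("DC1"), EnqueueNode("DC2"))
enq.acquire("ORDER/4711", "user_a")
enq.failover()
print(enq.active.name, enq.active.locks)
```

Because every lock is written to the replica before it is granted, the lock held by `user_a` is still present after the switch to DC2, and a competing request for the same key is still refused.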
Assumptions in the Architecture. We require a minimum availability of
99.9%. This will require the system to be replicated to another node within the
same region. Since the zones are within the Paris region, we will use a Sync
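The 99.9% availability requirement stated above can be translated into a downtime budget with simple arithmetic. The sketch below is illustrative: the 99% single-node figure is a hypothetical input, and the redundancy formula assumes independent node failures and an instantaneous switch, which a manual failover does not fully satisfy.

```python
HOURS_PER_YEAR = 365 * 24  # 8760 hours, ignoring leap years


def downtime_budget_hours(availability_pct, period_hours=HOURS_PER_YEAR):
    """Maximum downtime allowed in a period for a given availability target."""
    return period_hours * (1 - availability_pct / 100)


def redundant_pair_availability(node_pct):
    """Availability of two redundant nodes, assuming independent failures
    and an instantaneous switch (an idealisation)."""
    unavailable = 1 - node_pct / 100
    return (1 - unavailable ** 2) * 100


yearly = downtime_budget_hours(99.9)
monthly = downtime_budget_hours(99.9, period_hours=30 * 24)
print(f"99.9% allows about {yearly:.2f} h of downtime per year")
print(f"99.9% allows about {monthly:.2f} h of downtime per 30-day month")

# Hypothetical single-node availability, used only for illustration.
print(f"Two 99% nodes give about {redundant_pair_availability(99.0):.2f}%")
```

Under these assumptions, 99.9% corresponds to roughly 8.76 hours of downtime per year, which is why a second node within the region is needed rather than a single instance.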
Task type name   # steps      Time       Avg. proc. time   CPU time   DB time   Wait time
ALE              210,570      59,6       32,0              9,6        14,9      0,3
AUTOABAP         22,746       1.364,7    599,6             266,5      599,7     1,9
AUTOCCMS         113,618      2,7        2,6               0,3        0,0       0,1
AUTOTH           101,705      2,6        1,3               0,4        0,0       1,3
BACKGROUND       3204,851     6.660,6    2.291,2           931,2      4.262,9   0,0
BUFFER SYNC      56,846       11,7       3,2               1,9        8,4       0,1
DDLOG CLEANUP    56,846       14,1       2,1               0,3        12,0      0,1
DEL. THCALL      581          235,2      173,4             48,5       61,3      0,5
DIALOG           3986,076     848,2      144,9             98,2       441,6     0,8
HTTP             307,392      61,4       17,7              8,1        18,2      0,6
OTHER            145          15.206,0   157,8             140,8      2,5       0,2
RFC              16228,825    2.259,5    433,7             59,9       351,9     0,9
RPCTH            1,861        94,7       49,0              34,6       45,7      0,0
SPOOL            530,113      198,5      128,8             25,2       28,0      38,5
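The time columns in the table above use European number formatting (a dot as thousands separator, a comma as decimal separator), so they need converting before any comparison. The sketch below parses three rows copied from the table and computes the share of DB time. It assumes that the "Time" column is the average response time per step and that all time columns share one unit; the table does not state either.

```python
def parse_european(number_text):
    """Convert '1.364,7' (dot = thousands, comma = decimal) to 1364.7."""
    return float(number_text.replace(".", "").replace(",", "."))


# (Time, DB time) pairs copied from the table for three task types.
rows = {
    "DIALOG": ("848,2", "441,6"),
    "BACKGROUND": ("6.660,6", "4.262,9"),
    "RFC": ("2.259,5", "351,9"),
}


def db_share(time_text, db_text):
    """Fraction of the 'Time' column spent in the database (assumed meaning)."""
    return parse_european(db_text) / parse_european(time_text)


for task, (time_text, db_text) in rows.items():
    print(f"{task}: DB share of time {db_share(time_text, db_text):.0%}")
```

Read this way, a high DB share for a task type would point to database round trips, and therefore to the database and network placement, as the first place to look when tuning response times after a migration.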
3 Conclusion
The proposed method for migrating workloads from different regions has
consistently resulted in sustained stability and availability of the ERP system,
along with a significant improvement in application performance. These migrations
were of a “Lift and shift” nature, but the underlying principles can be applied to
any migration of productive workloads to the cloud. Gartner predicts that more than
half of enterprise IT spending in key market segments will shift to the cloud
by 2025, which translates to a significant volume of workloads moving to the cloud
in the coming years. The lessons learned and the proposed method can serve as a
starting point for the project, investment, and technical decisions of companies
moving their productive workloads to the cloud.
References
1. A Forrester Total Economic Impact™ study commissioned by Amazon.
https://pages.awscloud.com/Amazon_Connect_Forrester_TEI_Report.html. Accessed 06 Jan 2023
2. Subramanian, S.: Whither Goest Thou, Enterprise Workloads? IDC (2020).
https://blogs.idc.com/2020/03/02/whither-goest-thou-enterprise-workloads/.
Accessed 04 Jan 2022