Methods and Algorithms For Assessing Computer Network Performance
ABSTRACT
The article proposes an approach to assessing the reliability of a data transmission system. It describes the computational schemes, algorithms, and models used to assess various aspects of reliability. The method for calculating operability is presented here for the first time, although it is already widely used in the design of special-purpose systems.
KEYWORDS
Reliability assessment, operability, system reliability, computational schemes, algorithms, evaluation
model.
1. INTRODUCTION
Works on the reliability of computer systems [1,3,4,5,6] define reliability as the property of providing communication while keeping the established quality indicators within their limits over time under specified operating conditions. Increasingly, such definitions include not only, and not so much, the characteristics of the condition of the technical means and their operating time to failure, but also a number of user requirements: indicators of message reliability and the probability of timely message delivery. The range of user requirements for reliability indicators is therefore wide.

All of these requirements can be taken into account at the design stage by modeling the functioning of the system with element failures and restorations, and by evaluating the technical condition of the system, which is identified with network connectivity (structural reliability). The connectivity property, however, does not give the necessary degree of detail about all possible network states at the current time. It allows only two categories (the connectivity requirements are met or they are not) and does not show how “badly” connectivity is broken when the requirements are not met, although this is exactly what a more detailed assessment of the state of the computer network needs. The significance of the task of determining the technical condition of the network is steadily increasing.

Until recently, the search for network “bottlenecks”, that is, elements whose likely failure would degrade the functioning of the network or lead to a complete failure of the system, relied on the well-known provisions of reliability theory, including the concept of a complete system failure when an element fails. In practice, information can still be transmitted and received even when equipment has failed or part of the system elements has been disconnected. Known methods of assessing the reliability of computer systems do not take this into account. The gradual introduction [2] of conceptual definitions of the technical condition of objects and systems, together with the corresponding quantitative indicators, made it possible to solve a large number of network operation tasks correctly.
DOI:10.5121/acij.2024.15601
Advanced Computing: An International Journal (ACIJ), Vol.15, No.6, November 2024
$$ W_{ij} = \bigcup_{q=1}^{m} P_{ij}^{q}, \tag{1} $$
As highlighted earlier, the paramount concern for any user is that the network is in the state $Z_p^c$, which is both a necessary and a sufficient condition for proper operation in all modes. The network is therefore analyzed by dividing its states into two classes: operability and inoperability. These states can be described using Boolean variables. Each state of the path $P_{ij}$ corresponds to a Boolean variable $Z_p^{P_{ij}}$ that takes the value 1 or 0.

In turn, the state of each path is determined by the states of the network elements that belong to it. Each i-th element of the network corresponds to a Boolean variable $P_{ij}$ that takes the value 1 or 0. The state of a path is thus uniquely determined by the known states of the elements that form it.

To provide a more nuanced assessment of the network state, the indicator $\alpha_k^{ij}(t)$ is introduced. It represents the level of network operability at a given time from the perspective of the i-th user interacting with the j-th user, and is defined as follows:
$$ \alpha_k^{ij}(t) = \frac{П_{ij}^{l}}{m}, \tag{5} $$

where k is the index of the operability level, m is the number of possible paths between the i-th and j-th users, and $П_{ij}^{l}$ (l = 1, 2, …, m) is the number of operable paths that exist at a given time between the i-th and j-th users. For any moment of time t ≥ 0, the state of the network Z(t) is interpreted as a random variable. The evolution of the network state over time can be regarded as a stochastic process {Z(t), t ≥ 0} with a finite set of states.
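As a minimal sketch of equation (5), the Python fragment below derives the state of each path from Boolean element states and returns the share of operable paths for one pair of users. The element names, the `paths` list, and the function names are hypothetical and serve only as an illustration; a path is assumed to be operable only when every element on it is in state 1.

```python
def path_is_operable(path, element_state):
    """Assumption: a path is operable only if every element on it has state 1."""
    return all(element_state[element] == 1 for element in path)


def operability_level(paths, element_state):
    """Share of operable paths among the m possible paths, as in equation (5)."""
    m = len(paths)
    if m == 0:
        raise ValueError("at least one possible path is required")
    operable_paths = sum(1 for path in paths if path_is_operable(path, element_state))
    return operable_paths / m


# Hypothetical pair of users with m = 3 possible paths over elements a..e.
paths = [["a", "b"], ["a", "c", "d"], ["e"]]
element_state = {"a": 1, "b": 1, "c": 0, "d": 1, "e": 1}

# Element c has failed, so 2 of the 3 paths remain operable.
level = operability_level(paths, element_state)
print(level)
```

Unlike a binary connectivity check, the returned value shows how far the pair is from losing its last path.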
Consequently, the quantitative assessment of network operability from the perspective of two users at any given time can be computed using equation (5).

The following general definitions of technical condition are used:
- Serviceability is the condition in which the object meets all the established requirements. Malfunction is the condition in which the object does not meet at least one of the established requirements.
- Operability is the state in which the object can perform its specified functions while keeping the specified parameter values within the established limits. Inoperability is the state in which at least one specified parameter that characterizes the ability to perform the designated functions does not meet the requirements.
- Proper functioning is the state in which the object fulfills all regulated functions required at the current time while keeping the parameter values within the predefined limits. Improper functioning is the state in which the object fails to perform some of the regulated functions required at the current time and/or does not keep the specified parameter values within the established limits.

Based on these definitions, it is evident that:
- in a state of serviceability, the object is also operable;
- in a state of operability, the object functions properly in all modes;
- in a state of improper functioning, the object is inoperable and defective.
Moreover, a properly functioning object may still be inoperable and defective, and an operable object may also be defective.

Consider the events that lead to a transition from one technical condition to another. The event (Н) that moves an object
- from $Z_u$ to $\bar{Z}_u$ is called damage ($Н_п$);
- from $Z_p$ to $\bar{Z}_p$ is called failure ($Н_о$);
- from $Z_{PF}$ to $\bar{Z}_{PF}$ is called violation ($Н_н$).

The reverse transition
- from $\bar{Z}_u$ to $Z_u$ is called “restoration of serviceability” ($В_u$);
- from $\bar{Z}_p$ to $Z_p$ is called “restoration of operability” ($В_p$);
- from $\bar{Z}_{PF}$ to $Z_{PF}$ is called “restoration of proper functioning” ($В_{PF}$).

To define the types of technical condition of a communication network on the basis of these general concepts, the following secondary concepts are introduced:
- network resource: a set of tools necessary to perform one of the network functions;
- single resource: the amount of a resource determined by the minimum amount of the function performed for a given system (for example, a single amount of memory or a single unit of bandwidth);
- single connection $S_{ij}$ (i, j = 1, 2, …, N; i ≠ j), where N is the total number of users: a sequence of single network resources capable of performing all tasks of the data delivery process between the i-th and j-th users, characterized by unified quality indicators;
- connection $S_{ij}^{p}$: the minimum set of single connections capable of delivering data from the i-th to the j-th user with the specified delivery indicators for the p-th mode ($p = \overline{1, P}$, where P is the number of delivery modes).

The connection is characterized by the following parameters:

a) length:
$$ l(S_{ij}^{p}) = \sum_{\substack{q=1 \\ U_q \in S_{ij}^{p}}}^{K} l(U_q), \tag{6} $$

where $l(U_q)$ is the length of the q-th communication line between nodes and K is the number of communication lines in the connection;

b) time of existence:

$$ t(S_{ij}^{p}) = t_{уст}(S_{ij}^{p}) + t_{сохр}(S_{ij}^{p}) + t_{зав}(S_{ij}^{p}); \tag{7} $$
c) bandwidth $\mu(S_{ij}^{p})$;
d) priority of service;
e) broadcasting capability (multi-targeting);
f) discreteness of the input;
g) reliability: the probability that the connection is implemented and the probability that it is restored.

A path $П_{ij}$ (or a set of connections) is the minimum set of network resources that makes it possible to organize several connections, in any necessary combination of their types, between the i-th and j-th users.
Note that a path is characterized by the same parameters as a connection. Because the reliability of network resources is finite, and because a given probability of timely delivery between the i-th and j-th users must be ensured, backup paths are provided in addition to the main path. Together they make up a set of paths. The set of paths $W_{ij}$ is the set of all existing or possible paths between the i-th and j-th users:
$$ W_{ij} = \bigcup_{q=1}^{m} П_{ij}^{q}, \tag{9} $$
where m is the number of paths. $W_{ij}$ is characterized by the number of paths and by the probability that at least one operable path exists between the i-th and j-th users. The communication network can then be represented as L sets of paths, where L is the number of user pairs. The minimum number of independent paths between any two users is referred to as the network connectivity.
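Network connectivity as defined above can be computed, for example, as sketched below. The sketch assumes that “independent” means edge-disjoint (no shared communication line) and counts such paths with a unit-capacity augmenting-path search. The graph, the node labels, and the function names are hypothetical.

```python
from collections import deque
from itertools import combinations


def count_edge_disjoint_paths(edges, source, target):
    """Number of edge-disjoint paths between two nodes of an undirected graph."""
    # Each undirected line is modeled as two opposite arcs of capacity 1.
    residual = {}
    for u, v in edges:
        residual.setdefault(u, {})
        residual.setdefault(v, {})
        residual[u][v] = residual[u].get(v, 0) + 1
        residual[v][u] = residual[v].get(u, 0) + 1
    if source not in residual or target not in residual:
        return 0

    found = 0
    while True:
        # Breadth-first search for one more augmenting path.
        parent = {source: None}
        queue = deque([source])
        while queue and target not in parent:
            u = queue.popleft()
            for v, capacity in residual[u].items():
                if capacity > 0 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        if target not in parent:
            return found
        # Push one unit of flow back along the path that was found.
        v = target
        while parent[v] is not None:
            u = parent[v]
            residual[u][v] -= 1
            residual[v][u] += 1
            v = u
        found += 1


def network_connectivity(edges, users):
    """Minimum number of edge-disjoint paths over all pairs of users."""
    return min(count_edge_disjoint_paths(edges, i, j) for i, j in combinations(users, 2))


# Hypothetical four-node ring with one chord between nodes 1 and 3.
ring_with_chord = [(1, 2), (2, 3), (3, 4), (4, 1), (1, 3)]
print(count_edge_disjoint_paths(ring_with_chord, 1, 3))
print(network_connectivity(ring_with_chord, [1, 2, 3, 4]))
```

If node-disjoint paths are meant instead, each intermediate node would additionally have to be split into an input and an output node joined by a unit-capacity arc.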
The serviceability of the set of paths $W_{ij}$ is characterized by all paths $P_{ij}$ being operable; the failure of any $P_{ij}$ moves $W_{ij}$ to the malfunction state. The operability of $W_{ij}$ is characterized by the presence of at least one operable $P_{ij}$: a partial failure of this path moves $W_{ij}$ into the inoperable state, and a complete failure moves it into the limit state. The operability of $P_{ij}$ is characterized by all of its connections being operable; the failure of any connection moves $P_{ij}$ to the inoperable state, and the failure of the last remaining connection of the path (complete path failure) moves $P_{ij}$ to the limit state. The operability of $S_{ij}^{p}$ is characterized by the presence of all of its single connections; the failure of any of them leads to a transition to the inoperable state.

The definitions of the types of technical condition of an object discussed above, together with the secondary concepts, make it possible to define the types of technical condition of a computer network. Let us define them from the perspective of the i-th user corresponding with the j-th user (i, j = 1, 2, …, N; i ≠ j), where N is the number of users.

Network serviceability $Z_u^c$ is the state of the network characterized by the presence of all m possible paths $P_{ij}$ between the i-th and j-th users, that is, by the complete set of paths

$$ W_{ij} = \bigcup_{q=1}^{m} P_{ij}^{q}. \tag{11} $$
Network malfunction $\bar{Z}_u^c$ is a network condition in which at least one connection is inoperable. The event that triggers the transition from the state $Z_u^c$ to the state $\bar{Z}_u^c$ is termed network damage, denoted $Н_п^c$.

Network operability $Z_p^c$ is the state of the network characterized by the existence of at least one path $P_{ij}$ between the i-th and j-th users. Network inoperability $\bar{Z}_p^c$ is a network condition in which the last of the q (q = 1, 2, …, m) possible paths $P_{ij}$ between the i-th and j-th users has failed. The event that causes the transition from the state $Z_p^c$ to the state $\bar{Z}_p^c$ is termed a network failure, denoted $Н_о^c$.

Proper functioning over the established connection, $Z_{PF}^c$, is the state of the network in which error-free and timely data delivery is guaranteed. Improper functioning $\bar{Z}_{PF}^c$ is a network condition in which error-free and/or timely data delivery is not provided. The event that leads to the transition from $Z_{PF}^c$ to $\bar{Z}_{PF}^c$ is called a violation, denoted $Н_н^c$.

Since, as noted above, the user is most interested in finding the network in the state $Z_p^c$, which is a necessary and sufficient condition for proper functioning in all modes, the technical condition of the network is analyzed from this position; that is, the network states are classified into two subsets, operability and inoperability:

$$ Z^{ij} \in Z_p^{ij} \quad \text{or} \quad Z^{ij} \in \bar{Z}_p^{ij}, \qquad Z_p^{ij} \cap \bar{Z}_p^{ij} = \varnothing. \tag{12} $$
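The classification in (12), together with the definition of network serviceability, can be sketched as follows. The function name and the returned labels are hypothetical; the input is a list of Boolean path states for one pair of users (1 for an operable path, 0 otherwise), which is an assumed encoding.

```python
def classify_pair_state(path_states):
    """Classify the network state for one user pair from Boolean path states."""
    if not path_states:
        raise ValueError("at least one possible path is required")
    operable_paths = sum(1 for state in path_states if state == 1)
    return {
        # Serviceable: all m possible paths are present.
        "serviceable": operable_paths == len(path_states),
        # Operable: at least one path still exists between the two users.
        "operable": operable_paths >= 1,
    }


print(classify_pair_state([1, 1, 1]))  # every path present
print(classify_pair_state([1, 0, 0]))  # damaged, but one path remains
print(classify_pair_state([0, 0, 0]))  # the last path has failed
```

The two flags make the nesting of the states explicit: a serviceable network is always operable, while an operable network may already be damaged.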
The state of the network is uniquely determined by the states of the sets of paths $W_{ij}$, and the state of each $W_{ij}$ is, in turn, determined by the states of the paths between the corresponding i-th and j-th users. The basic concepts of the computer system operation process considered above make it possible to:
- formulate the task of general and technical operation of a computer system in specific terms;
- identify user classes depending on the set of required delivery modes;
- identify the computer system resources necessary to meet the needs of users;
- develop an algorithm for managing computer system resources.
and priority. At random moments in time, the system receives messages (requests) for service. Since the input stream is Poisson, the arrival time of the k-th message is determined recursively from the arrival time of the preceding message, where $\lambda_{вх}$ is the intensity of message arrival and $\xi_k$ are random numbers uniformly distributed in the interval (0, 1).

The input of the system receives $q^*$ incoming message streams (q = 1, 2, …, $q^*$). Messages are unequal in importance: messages from the q-th stream are more important than messages from the (q+1)-th stream, which in turn are more important than messages from the (q+2)-th stream, and so on (all messages within a stream are equivalent). Messages are queued according to priority and, within a priority, in order of arrival. The priority of each message is determined randomly, according to the share of messages of each priority in the total stream.

The messages are served by $S^*$ identical channels, numbered 1, 2, 3, …, S, …, $S^*$. The system serves incoming messages in order of importance, so that the queued message with the lowest priority number always goes to the vacant channel. If a low-priority message has been accepted for service in any channel, its service continues to the end even if higher-priority messages arrive during that time; that is, service with relative priority (without interruption) is used. Message transmission obeys an exponential distribution law.
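The arrival and queueing rules described above can be sketched in Python as follows. The recursion uses the standard inverse-transform form for a Poisson stream, t_k = t_{k-1} - ln(ξ_k)/λ, which is an assumption here, as are the function and class names, the priority shares, and the numeric values.

```python
import heapq
import math
import random


def arrival_times(rate, count, rng):
    """Arrival times of a Poisson stream: t_k = t_{k-1} - ln(xi_k) / rate (assumed form)."""
    times = []
    t = 0.0
    for _ in range(count):
        xi = 1.0 - rng.random()  # uniform on (0, 1], which avoids log(0)
        t -= math.log(xi) / rate
        times.append(t)
    return times


def draw_priority(shares, rng):
    """Pick a priority number 1..q* with probability equal to its share of the stream."""
    return rng.choices(range(1, len(shares) + 1), weights=shares, k=1)[0]


class WaitingQueue:
    """Queue ordered by priority number, then by arrival time."""

    def __init__(self):
        self._heap = []

    def push(self, priority, arrival_time, message_id):
        heapq.heappush(self._heap, (priority, arrival_time, message_id))

    def pop_for_free_channel(self):
        """Return the message that goes to a vacant channel.

        Messages already in service are never interrupted (relative priority),
        so the selection happens only when a channel becomes free.
        """
        _priority, _arrival_time, message_id = heapq.heappop(self._heap)
        return message_id

    def __len__(self):
        return len(self._heap)


rng = random.Random(1)
times = arrival_times(rate=2.0, count=5, rng=rng)
queue = WaitingQueue()
for message_id, t in enumerate(times):
    queue.push(draw_priority([0.2, 0.3, 0.5], rng), t, message_id)
print(len(queue), "messages waiting")
```

A heap keyed on (priority number, arrival time) gives the required order directly: the lowest priority number first, and the earliest arrival first within one priority.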
The model takes into account the failure and restoration of service channels. Each channel has its own uptime, which is a random variable with a corresponding distribution law. The duration of channel recovery after a failure is also a random variable governed by a specified distribution law. If a channel fails while a message is being served in it, the message is returned to the head of the queue of the corresponding priority. The same message can therefore enter both the service area and the waiting area repeatedly. The model takes into account the following message service failures:
- due to queue overflow;
- due to exceeding the allowed waiting time for service.

Given the importance of the time advancement mechanism in constructing such models, consider the possible methods of setting the system time. The model must run in artificial time that guarantees that events occur in the correct order and with the appropriate time intervals between them. Because events may occur simultaneously in different parts of a real system, a time-setting mechanism is needed to synchronize the actions of the system components within a given time interval. There are two main methods for setting time:
- the fixed time step method, in which the system time is advanced in predetermined intervals of constant length;
- the next-event method, in which the state of the simulated system is updated at the occurrence of each significant event, regardless of the time intervals between events.

Each of these methods has its advantages and disadvantages. The next-event method eliminates the need to choose an arbitrary artificial time increment, and so avoids the danger that an increment chosen without the user's knowledge will change the simulation results. In addition, events are treated and served as simultaneous only if they carry the same time of occurrence. On the other hand, the fixed-step method works better when many events occur during the simulation cycle and the expected duration of events is short, and also when the exact nature of the significant events is not yet clear, as happens at the initial stage of a study.

For the model of computer network operation, the next-event method is chosen, mainly because it gives more accurate results and does not require choosing the size of the time increment. The following significant events are selected in the model: the arrival of a message in the system and the release of a communication channel.

The system time T in the model is advanced as follows. At the start of the simulation, T is set to zero (T = 0). Then the arrival time of the first message is generated and T is advanced to it: $T = t_{пост}^{1}$. All subsequent advances of T depend on which of the two nearest events, the arrival of a message or the release of a data channel, happens earlier.
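The choice of the next event described above can be sketched as follows. The function name, the use of `None` for an idle channel, and the rule that an arrival wins a tie are assumptions made only for this illustration.

```python
import math


def next_event(next_arrival_time, channel_release_times):
    """Return the new system time T and the kind of the nearest event.

    channel_release_times holds one entry per channel: the time at which the
    channel will be released, or None if the channel is idle.
    """
    busy = [t for t in channel_release_times if t is not None]
    next_release = min(busy, default=math.inf)
    if next_arrival_time <= next_release:
        return next_arrival_time, "arrival"
    return next_release, "release"


# Three channels: the first is idle, the others are released at 4.0 and 2.5.
T, kind = next_event(3.2, [None, 4.0, 2.5])
print(T, kind)  # 2.5 release
```

Because T jumps straight to the nearest event, no artificial time increment has to be chosen.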
4. A random number $\eta$ is generated according to the distribution law of the incoming flow (failures and recoveries).
5. The time of the next state change (failure or recovery) of an element is formed:

$$ t_k = t_{k-1} + \eta, \tag{17} $$

where $t_k$ is the time of occurrence of the next event and $t_{k-1}$ is the time of occurrence of the previous event.
6–7. The number of existing paths $P_{ij}(t_k)$ between all users at time $t_k$ is determined.
8. The degree of network operability from the position of the i-th user corresponding with the j-th user is determined:

$$ \alpha_{ij}(t_k) = \frac{P_{ij}(t_k)}{m_{ij}}. \tag{18} $$

9. The level of operability of the network as a whole, $\alpha_c$ (from the point of view of maintenance personnel), at the given time is determined.
10. A check is made whether the time $t_k$ of the random event is still within the given simulation interval: $t_k < T_M$. If the condition is met, control is transferred to step 4. If the condition is not met, i.e. the simulation interval is exhausted, the simulation results are processed (step 11).
11. The results are analyzed.
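Steps 4 to 11 can be sketched as the loop below. Several details are assumptions made for the illustration: all elements start operable, the interval between events is exponentially distributed, each event toggles one randomly chosen element (failure if it was operable, recovery otherwise), and the network-wide level α_c is taken as the mean of α_ij over the user pairs. The function name, the element names, and the path lists are hypothetical.

```python
import random


def simulate_operability(paths_by_pair, elements, event_rate, horizon, seed=0):
    """Return a list of (t_k, alpha_ij per pair, alpha_c) records."""
    rng = random.Random(seed)
    state = {element: 1 for element in elements}  # assumption: all operable at t = 0
    t = 0.0
    history = []
    while True:
        eta = rng.expovariate(event_rate)  # step 4: random interval (assumed exponential)
        t += eta  # step 5: t_k = t_{k-1} + eta, equation (17)
        element = rng.choice(elements)  # assumption: a random element changes state
        state[element] = 1 - state[element]  # failure or recovery
        alpha = {}
        for pair, paths in paths_by_pair.items():
            # steps 6-7: number of paths that exist at t_k
            existing = sum(1 for path in paths if all(state[e] == 1 for e in path))
            alpha[pair] = existing / len(paths)  # step 8: equation (18)
        alpha_c = sum(alpha.values()) / len(alpha)  # step 9: assumed to be the mean
        history.append((t, alpha, alpha_c))
        if not t < horizon:  # step 10: simulation interval T_M exhausted
            break
    return history  # step 11: the records are passed on for analysis


elements = ["a", "b", "c", "d"]
paths_by_pair = {
    (1, 2): [["a"], ["b", "c"]],
    (1, 3): [["a", "d"], ["c"]],
}
history = simulate_operability(paths_by_pair, elements, event_rate=1.0, horizon=10.0)
print(len(history), "state changes simulated")
```

Following the order of the steps, the check against the simulation interval comes after the record for t_k is stored, so the last record lies at or beyond T_M.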
4. CONCLUSIONS
The reliability of modern computer networks is a complex property whose definition has to take into account a wide variety of parameters, including the structural and operational characteristics of the service. Existing methods of performance assessment have limitations: a) analytical methods are not suitable for multi-pole systems; b) statistical methods take into account only failures and recoveries of the technical elements of the communication network. In many available calculation methods, reliability is, as a rule, assessed according to a single parameter (for example, network connectivity, time to failure, or cost of element restoration). This is not entirely correct, because a real network can continue to perform its functions when some of its elements fail. The operability of a computer system is a new reliability assessment indicator that takes into account how the system performs its functions.

The purpose of this study is to create an automation system for evaluating and improving reliability that takes into account the structure of the network, its properties, and its functional characteristics. The principal functional characteristics of a hierarchical computer network, assuming an arbitrary message transmission flow and a predetermined service order, are calculated using a multi-phase queuing system (QS) model. Each phase of this model is represented as a QS of the $\vec{M}_k/M/n/r < \infty$ type. The structural reliability of the network can be estimated as the average proportion of connections between the elements of the network graph that is preserved when arbitrary elements are damaged. The proposed method of assessing operability places no restrictions on the number of corresponding pairs in the network and makes it possible to track the dynamics of network operability. As a criterion of “operational” reliability, i.e. an assessment of the costs necessary for the reliable functioning of the communication network, the “reduced costs” indicator is justified as reflecting the structural and technical characteristics of the communication network.
The developed automation system for evaluating the reliability of a computer network makes it possible, for the first time, to obtain the dependence of the functional characteristics on the level of operability and on the structural characteristics. This reveals the dynamics of changes in the system's operability, taking into account the failure and restoration of elements on the one hand and the influence of failures on the characteristics of the system's functioning on the other. This conclusion is confirmed by a computational experiment.
Three schemes for improving the reliability of a computer network are presented: the restoration of network elements, the introduction of a reserve, and a combined scheme. Each particular scheme is selected taking into account the costs necessary for its implementation. A computational experiment conducted on a computer network with a specified configuration demonstrates the viability of the proposed approach for evaluating and enhancing reliability.

The analysis of the current state of the reliability problem in computer networks carried out in this study shows that reliability, understood as the property of ensuring communication while keeping the established quality indicators within their limits over time under specified operating conditions, is determined not only by the characteristics of the state of the technical means and their operating time, but also by many user requirements: indicators of the reliability of message delivery, error-free transmission of messages, and the probability of timely delivery of messages. Taking these requirements into account does not fit into the schemes of traditional reliability calculation methods and requires a new, systematic approach to forming a comprehensive reliability assessment. This is the subject of this work, the scientific and practical results of which are as follows:

1. A systematic approach to assessing the operability of a computer network is proposed, covering its functional, structural, and operational aspects for networks of arbitrary configuration. It makes it possible to provide management with objective and reliable information about the state of the computer network. The structural parameters and performance indicators of a computer network need to be calculated for two reasons: to assess the state of the network in order to make network management decisions when elements are damaged, and to evaluate intermediate network variants at the synthesis stage.
2. A new method for calculating the structural reliability of complex multifunctional structures, such as computer networks, is proposed. The structural reliability of the network can be estimated as the average proportion of connections between the elements of the network graph that is preserved when arbitrary elements are damaged.
3. A method is proposed for assessing the technical condition of a computer network based on assessments of the operability of both individual corresponding node pairs and the entire network. The method imposes no limitations on the number of corresponding pairs in the network and enables monitoring of the dynamics of network operability.
4. The criterion for estimating the cost of network maintenance is chosen on the assumptions that measures to restore elements should be the simplest and that the cost of diagnostic tools should be minimal.
5. The structure and management principles of the automation system for obtaining a comprehensive health assessment have been developed, including blocks for calculating the functional characteristics, assessing the level of system operability, and mechanisms for improving system operability when its elements fail.
6. The dependence between the functional characteristics of reliability and the level of operability is obtained. This makes it possible to identify the risk zones in which the system may go into an inoperable state.

Thus, a method has been introduced for evaluating the technical condition of a computer network by assessing the operability of individual corresponding node pairs as well as of the network as a whole (for all users). The method imposes no restrictions on the number of corresponding pairs, enables monitoring of the dynamics of network operability, and consequently serves as a convenient tool in synthesizing network maintenance solutions. It can also serve as a key component in the design of future digital systems.
REFERENCES
[1] Polovko A.M. & Gurov S.V., (2006) “Fundamentals of Reliability Theory” (in Russian), BHV-Petersburg Publishers, 704 p.
[2] Zakharov G.P. & Zakharenko G.P., (1989) “A deterministic model for assessing the survivability and vulnerability of networks” (in Russian), USSR Academy of Sciences Publishers, Technical Cybernetics, No. 2.
[3] Gromov Yu.Yu., (2010) “Reliability of Information Systems” (in Russian), GOU VPO TGTU Publishers, 160 p.
[4] Guzik V.F. & Samoylenko A.P., (2008) “Principles of designing an integral model for assessing the reliability of information and computing systems” (in Russian), YuFU. Engineering Sciences Publishers, pp 36-39.
[5] Vasilenko N.V. & Makarov V.A., (2004) “Models for assessing software reliability” (in Russian), Bulletin of Novgorod State University Publishers, No. 2, pp 126–132.
[6] Chekal E.G. & Chichev A.A., (12) “Reliability of Information Systems” (in Russian), UlGU Publishers, 118 p.
[7] N. Rakhimov & O. Primkulov, (2023) “An approach to improving the efficiency of logical inference in information systems” (in Uzbek), International Scientific and Practical Conference on Algorithms and Current Problems of Programming, pp 56-59.
AUTHOR
Aziz Ishmukhamedov