International Journal in Foundations of Computer Science & Technology (IJFCST), Vol.5, No.1, January 2015
DOI: 10.5121/ijfcst.2015.5106
DESIGN AND ANALYSIS OF RA SORT
Harsh Ranjan1, Sumit Agarwal1 and Niraj Kumar Singh*1
1 Department of Computer Science and Engineering,
Birla Institute of Technology, Mesra, India
ABSTRACT
This paper introduces a new comparison-based stable sorting algorithm, named RA sort. RA sort
sorts an array by comparing only selected pairs of elements; it does not compare each element
with every other element. It builds upon the relationships established between the elements in
each pass: instead of a blind comparison, it uses a selective comparison to obtain an efficient
method. Sorting is a fundamental operation in computer science. The algorithm is analysed both
theoretically and empirically to obtain a robust average case result. We have performed an
empirical analysis and compared its performance with the well-known quick sort for various
input types. Although the theoretical worst case complexity of RA sort is
$Y_{worst}(n) = O(n\sqrt{n})$, the experimental results suggest an empirical
$O_{emp}((n \lg n)^{1.333})$ time complexity for typical input instances, where the parameter n
characterizes the input size. The theoretical complexity is given for the comparison operation.
We emphasize that the theoretical complexity is operation specific, whereas the empirical one
represents the overall algorithmic complexity.
KEYWORDS
Algorithm, quick sort, RA sort, Theoretical Analysis, Empirical Analysis, Uniform Distribution Model,
Poisson Model, Binomial Model.
1. INTRODUCTION
This paper introduces a new comparison-based stable sorting algorithm, named RA sort. Though
many sorting algorithms have been developed, no single technique is best suited for all
applications. In a basic comparison sort we need to check each element against the rest of
the array in order to find its appropriate location. In RA sort we only need to compare selected
pairs of elements that are a defined distance apart. This selective comparison among
elements saves a fair amount of time and a fair number of comparisons.
Although the theoretical worst case complexity of RA sort is $Y_{worst}(n) = O(n\sqrt{n})$, the
experimental results reveal an empirical $O_{emp}((n \lg n)^{1.333})$ time complexity for typical
inputs, so the algorithm can perform well in practice.
2. ALGORITHM RA SORT
The RA sort algorithm sorts an array by comparing only selected pairs of elements; it does not
compare each element with every other element. The algorithm builds upon the relationships
established between the elements in each pass. For an input (a1, a2, a3) with a1<a2 and a2<a3,
it can easily be inferred that a1<a3, so there is no need to compare a1 and a3. The algorithm
uses this technique to place each element in its appropriate location while saving a
significantly large number of comparisons.
RA sort first determines the minimum length such that all elements get placed in their
appropriate locations. This length is the maximum forward distance over which an index is
compared. The algorithm starts with this length and decreases it by one until it reaches one;
at each length, every element is compared only with the element that is this fixed length ahead
of it. The minimum value of the length can be found by binary search, since for all values
greater than this length the array will be sorted and for all values less than it the array
will be only partially sorted. Consider an array of four elements. In a basic comparison sort,
a1 may need to be compared with a2, a3 and a4, and the same goes for a2, a3 and a4, before the
sorting completes. If the length equals two, however, then in the first loop a1 is compared
only with a3 and a2 only with a4; in the second loop, when the length has been decremented to
one, a1 is compared only with a2, a2 only with a3, and a3 only with a4. These comparisons take
place one by one; the value at an index might change at any point, and the updated value at
that index is used for later comparisons. Thus, when a starting length of two suffices, the
array is sorted in only five comparisons (two when the length equals two and three when the
length equals one) instead of a total of twelve. This difference grows greatly as the size of
the array increases.
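The four-element walk-through can be traced with a short Python sketch. The helper names comparison_pairs and run_passes are hypothetical and are not part of the paper; the code only replays the compare-and-exchange passes for a chosen starting length.

```python
def comparison_pairs(n, start_length):
    """List the index pairs (J, J+I) compared for I = start_length, ..., 1."""
    pairs = []
    for length in range(start_length, 0, -1):
        # For a fixed length, J runs while J + length < n.
        for j in range(n - length):
            pairs.append((j, j + length))
    return pairs


def run_passes(values, start_length):
    """Apply the compare-and-exchange passes to a copy of values."""
    a = list(values)
    for j, k in comparison_pairs(len(a), start_length):
        if a[j] > a[k]:
            a[j], a[k] = a[k], a[j]
    return a


# Zero-based indices: (0, 2) corresponds to comparing a1 with a3.
print(comparison_pairs(4, 2))       # [(0, 2), (1, 3), (0, 1), (1, 2), (2, 3)]
print(run_passes([4, 3, 2, 1], 2))  # [1, 2, 3, 4]
print(run_passes([3, 1, 4, 2], 2))  # [1, 3, 2, 4]
print(run_passes([3, 1, 4, 2], 3))  # [1, 2, 3, 4]
```

The third call shows an input for which a starting length of two is not enough, while a starting length of three sorts it. This is why the starting length has to be chosen with care.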
This minimum length does not have a general formula that holds for all input sizes, but a
rough estimate can be made. The estimate gives the minimum value in most cases; in a few
cases it gives a slightly higher value, which still sorts the array. Let us denote this
minimum value by K. Then $K = T \cdot \lg n$, where
$T = \lceil (\sqrt{1+8n}-1)/(2\lg n) \rceil$.
Derivation of T: Let n denote the input size of a sample and let h = lg n. The maximum jump
required by any element to reach its correct position is n-1 (the smallest element is at the
last position or the largest at the first). After x iterations, the maximum jump that RA sort
can give any element from its initial location is 1+2+3+…+x. Multiplying x by h and summing
the series gives (x*h)*(x*h+1)/2. This value must be at least n-1 so that every element can
reach its appropriate location in the worst case. Replacing n-1 by n for ease of calculation
and writing z = x*h, we have:
$$z(z+1) \ge 2n \;\Rightarrow\; z^2 + z - 2n \ge 0 \;\Rightarrow\; z \ge \frac{-1+\sqrt{1+8n}}{2},$$
and since z = x*h, taking the boundary value we get
$$x = \frac{\sqrt{1+8n}-1}{2h}.$$
Thus $T = \lceil x \rceil$, where the ceiling covers boundary cases. Since h = lg n is fixed
for a given n, this gives $T = \lceil (\sqrt{1+8n}-1)/(2\lg n) \rceil$. The minimum length is
given as T*lgn. It can be seen that, in the general case, any length less than this cannot
sort the array completely, because not every element would end up at its appropriate location,
whereas every greater length will.
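As a numeric check of the derivation, the following Python sketch computes T and the starting length K for a given n. The function name start_length is hypothetical, and rounding lg n up when forming K is an assumption about how the algorithm listing rounds it.

```python
import math


def start_length(n):
    """Return (T, K) for input size n >= 2, following the derivation of T."""
    if n < 2:
        raise ValueError("n must be at least 2 so that lg n > 0")
    h = math.log2(n)                          # h = lg n
    x = (math.sqrt(1 + 8 * n) - 1) / (2 * h)  # x = (sqrt(1 + 8n) - 1) / (2h)
    t = math.ceil(x)                          # T = ceil(x)
    k = t * math.ceil(h)                      # assumption: K = T * ceil(lg n)
    return t, k


print(start_length(4))       # (2, 4)
print(start_length(100000))  # (27, 459)
```

With this choice K is never smaller than z = x*h, so K(K+1)/2 is at least n. This is the condition the derivation asks for; it is a necessary condition on the reach of an element, not a proof that every input gets sorted.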
Algorithm: RA_Sort (A, n):
// Here A[0 : n-1] is the input array of size n
// T = ⌈(√(1+8n) − 1) / (2∗lg n)⌉
FOR (I = T*⌈lg n⌉; I ≥ 1; I = I-1)
    FOR (J = 0; J+I < n; J = J+1)
        IF (A[J] > A[J+I])
            Exchange (A[J], A[J+I])
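The listing can be turned into a small runnable Python sketch. The name ra_sort and the optional start parameter are illustrative choices, not the authors' C++ implementation, and the default starting length uses the reconstructed estimate K = T*⌈lg n⌉. Whether this estimate suffices for every input is the paper's claim and is not verified by the code.

```python
import math


def ra_sort(values, start=None):
    """Run the RA sort passes on a copy of values and return the result."""
    a = list(values)
    n = len(a)
    if n < 2:
        return a
    if start is None:
        # Assumed estimate: T = ceil((sqrt(1 + 8n) - 1) / (2 lg n)), K = T * ceil(lg n)
        h = math.log2(n)
        t = math.ceil((math.sqrt(1 + 8 * n) - 1) / (2 * h))
        start = t * math.ceil(h)
    for length in range(start, 0, -1):      # I = K, K-1, ..., 1
        for j in range(n - length):         # J = 0 while J + I < n
            if a[j] > a[j + length]:
                a[j], a[j + length] = a[j + length], a[j]
    return a


print(ra_sort([3, 1, 4, 2]))     # [1, 2, 3, 4]
print(ra_sort([5, 4, 3, 2, 1]))  # [1, 2, 3, 4, 5]
```

Passes whose length is at least n compare nothing, so an estimate that exceeds n-1 only costs empty loop iterations.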
Analysis is an integral part of the study of any algorithm, as it gives an idea of the
resources consumed during execution. The RA sort algorithm is analyzed both theoretically and
empirically. Theoretical analysis is well suited to obtaining worst case results because it
gives, in some sense, a guarantee of performance. At the same time, this guarantee can be
conservative. In such cases a certificate on the level of conservativeness is necessary, and
it is given by the concept of empirical-O used in this paper. We have done a theoretical
analysis over the comparison operation to get the worst case performance in big-oh notation.
The average case analysis is done using a statistical bound estimate (also called empirical-O).
The reader is referred to references [1] and [2] for insight into empirical-O.
2.1.1. Theoretical Analysis (Worst Case Only)
Referring to the pseudocode of RA sort (which contains two for loops), its runtime complexity
is expressed as the following summation:
$$Y(n) = \sum_{I=1}^{K} \sum_{J=0}^{n-I-1} 1, \qquad K = T\lceil \lg n \rceil \qquad (1)$$
The worst case bound is $Y_{worst}(n) = O(T \cdot \lg n \cdot n) = O(n\sqrt{n})$, which is
obtained by substituting $T = \lceil (\sqrt{1+8n}-1)/(2\lg n) \rceil$.
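The double summation in equation (1) can be checked numerically. The sketch below counts the executions of the IF test by direct looping and compares the count with a closed form; the function names are hypothetical, and the closed form assumes that passes with length I ≥ n contribute no comparisons.

```python
def count_comparisons(n, k):
    """Count executions of the IF test for lengths I = k, ..., 1."""
    return sum(max(0, n - length) for length in range(1, k + 1))


def closed_form(n, k):
    """Sum of (n - I) for I = 1..m, where m = min(k, n - 1)."""
    m = min(k, n - 1)  # passes with length >= n compare nothing
    return m * n - m * (m + 1) // 2


print(count_comparisons(4, 2), closed_form(4, 2))  # 5 5
print(count_comparisons(4, 4), closed_form(4, 4))  # 6 6
```

For K much smaller than n the count is close to K*n, which is the source of the O(T*lgn*n) bound.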
2.1.2. Empirical Analysis (Average Case Only)
This section includes the empirical results obtained for the average case analysis of RA sort
and quick sort. The algorithm was run on data drawn from uniform and non-uniform discrete
distribution models: the discrete uniform, Poisson and binomial distributions.
The performance of RA sort is also compared with the standard version of the quick sort
algorithm [3] on the same input types. The observed mean time (in seconds) over 1000 trials is
recorded in Table 1.
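A mean run time over repeated trials could be measured along the following lines. This is only a sketch in Python: the paper does not describe its timing code, so the function name mean_time, the use of a fresh input per trial, and the choice of time.perf_counter are all assumptions.

```python
import time


def mean_time(sort_fn, make_input, trials=1000):
    """Return the mean wall-clock time in seconds of sort_fn over the trials."""
    total = 0.0
    for _ in range(trials):
        data = make_input()            # fresh input for every trial
        begin = time.perf_counter()
        sort_fn(data)
        total += time.perf_counter() - begin
    return total / trials


# Usage with Python's built-in sort as a stand-in for the algorithm under test.
print(mean_time(sorted, lambda: [5, 3, 1, 4, 2], trials=10))
```

Input generation is kept outside the timed region so that only the sorting work contributes to the mean.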
Average case analysis was done by working directly on program run time to estimate the weight
based statistical bound over a finite range by running computer experiments [4], [5]. This
estimate is called empirical-O [1], [2]. Here the time of an operation is taken as its weight.
Weighing permits collective consideration of all operations in a conceptual bound, which we
call a statistical bound to distinguish it from the count based mathematical bounds that
are operation specific. The way we design and analyze our computer experiment certainly has a
great impact on the credibility of empirical-O. See reference [2] for more insight into the
philosophy behind the statistical bound and empirical-O. The statistical analysis and the
various interpretations are guided by [6].
The samples are generated randomly, using a random number generating function, to characterize
the discrete uniform, Poisson and binomial distribution models with k, λ and (m, p) as their
respective parameters. Our sample sizes lie between $1 \times 10^5$ and $20 \times 10^5$.
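Samples of the three kinds could be generated as in the Python sketch below, which uses only the standard library. The function names are hypothetical, the reading of k as the range 1..k of the discrete uniform values is an assumption, and the Poisson and binomial generators are textbook methods (Knuth's multiplication method and a sum of Bernoulli trials), not necessarily what the authors used.

```python
import math
import random


def discrete_uniform_sample(n, k, rng):
    """n integers drawn uniformly from 1..k (assumed meaning of parameter k)."""
    return [rng.randint(1, k) for _ in range(n)]


def poisson_variate(lam, rng):
    """One Poisson(lam) value by Knuth's multiplication method (small lam)."""
    limit = math.exp(-lam)
    count, product = 0, rng.random()
    while product > limit:
        count += 1
        product *= rng.random()
    return count


def binomial_variate(m, p, rng):
    """One Binomial(m, p) value as the number of successes in m trials."""
    return sum(1 for _ in range(m) if rng.random() < p)


rng = random.Random(2015)  # fixed seed so that a run can be repeated
du = discrete_uniform_sample(10, 50, rng)
po = [poisson_variate(4, rng) for _ in range(10)]
bi = [binomial_variate(400, 0.5, rng) for _ in range(10)]
print(du, po, bi)
```

All three models produce many repeated keys at the sample sizes used here, since the number of distinct values is far smaller than n.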
Table 1. Observed mean times in seconds for RA sort and quick sort
(DU = discrete uniform with k = 50; Poisson with λ = 4; Binomial with (m, p) = (400, 0.5))
N | RA sort: DU | RA sort: Poisson | RA sort: Binomial | Quick sort: DU | Quick sort: Poisson | Quick sort: Binomial
100000 0.203 0.187 0.187 0.218 1.422 0.125
200000 0.531 0.515 0.531 0.812 5.688 0.421
300000 0.969 0.937 1.000 1.811 9.883 0.938
400000 1.500 1.508 1.515 3.213 14.313 1.641
500000 2.112 2.161 2.140 5.057 19.524 2.516
600000 2.793 2.915 2.828 7.299 25.139 3.657
700000 3.500 3.619 3.484 10.067 30.213 5.234
800000 4.328 4.619 4.298 13.047 37.169 6.406
900000 5.141 5.301 5.121 16.364 42.130 8.110
1000000 6.023 6.042 5.984 20.251 46.929 10.016
1100000 6.953 7.027 6.938 24.564 51.142 12.095
1200000 7.924 7.975 7.933 29.095 56.521 14.438
1300000 8.876 9.079 8.876 34.219 60.998 16.986
1400000 9.878 10.079 9.891 39.496 64.321 19.579
1500000 10.997 11.108 10.969 45.355 69.032 22.501
1600000 12.050 12.239 12.047 51.707 74.328 25.693
1700000 13.290 13.548 13.291 58.335 78.431 28.485
1800000 14.441 14.774 14.449 65.279 84.320 32.563
1900000 15.627 15.907 15.835 72.787 90.001 36.222
2000000 16.871 17.243 17.070 80.403 95.113 41.012
Below we present two comparative plots of RA sort against quick sort. Figures 1 and 2 show
that RA sort is faster than quick sort on the discrete uniform and Poisson distribution data
models for the specified parameter values.
Figure 1. Plot of n versus y (discrete uniform distribution, k=50)
[Plot: mean time in sec. (Y) versus input size (n); RA sort (PS) versus quick sort (QS)]
Figure 2. Plot of n versus y (Poisson distribution, λ=4)
[Plot: mean time in sec. (Y) versus input size (n); RA sort (PS) versus quick sort (QS)]

Table 2. General Regression Analysis: Y versus n, nlogn, n^2

Box-Cox transformation of the response with specified lambda = 0.75

Regression Equation
Y^0.75 = 0.0383403 - 3.90451e-006 n + 3.88879e-007 nlogn - 4.74249e-014 n^2

Coefficients
Term       Coef        SE Coef     T         P
Constant   0.0383403   0.0250803   1.52870   0.146
n         -0.0000039   0.0000010  -3.79282   0.002
nlogn      0.0000004   0.0000001   7.38394   0.000
n^2       -0.0000000   0.0000000  -1.06788   0.301

Summary of Model
S = 0.0125067   R-Sq = 100.00%   R-Sq(adj) = 100.00%
PRESS = 0.00435678   R-Sq(pred) = 100.00%

Analysis of Variance
Source      DF  Seq SS   Adj SS   Adj MS   F       P
Regression   3  121.313  121.313  40.4378  258523  0.000000
n            1  121.174  0.002    0.0023   14      0.001597
nlogn        1  0.139    0.009    0.0085   55      0.000002
n^2          1  0.000    0.000    0.0002   1       0.301421
Error       16  0.003    0.003    0.0002
Total       19  121.316

Fits and Diagnostics for Unusual Observations for Transformed Response
Obs  Y^0.75   Fit      SE Fit     Residual    St Resid
1    0.30243  0.29333  0.0106239   0.0090990   1.37876  X
16   6.46756  6.49317  0.0043949  -0.0256095  -2.18714  R

Fits for Unusual Observations for Original Response
Obs  Y       Fit
1    0.203   0.1949   X
16   12.050  12.1137  R

R denotes an observation with a large standardized residual.
X denotes an observation whose X value gives it large leverage.

Durbin-Watson statistic = 1.75208
The program runtime data corresponding to the discrete uniform distribution samples is fitted
to a quadratic model of the type $y = b_0 + b_1 n + b_2 n\log_2 n + b_3 n^2$. To get a suitable
model, the response variable is transformed using the Box-Cox transformation [7], with the
transformation parameter lambda set to 0.75. Below is the corresponding regression
model (the complete result is available in Table 2).
Regression Equation:
y^0.75 = 0.0383403 - 3.90451e-006 n + 3.88879e-007 nlogn - 4.74249e-014 n^2
As the statistical significance of the quadratic term is very weak (P = 0.301 in Table 2), we
drop it from our model, which reduces to y^0.75 = 0.0383403 - 3.90451e-006 n + 3.88879e-007
nlogn. Consequently we have $y^{0.75} = O_{emp}(n\log_2 n)$, which implies that
$y = O_{emp}((n\log_2 n)^{1/0.75}) = O_{emp}((n\log_2 n)^{1.333})$. The standard error of this
model is very low (S = 0.0125067) and it explains almost all of the variation (the R-Sq(adj)
value is reported as 100.00%). These observations lead us to conclude that the average case
complexity of RA sort is $Y_{avg}(n) = O_{emp}((n\log_2 n)^{1.333})$.
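The fitted model can be evaluated and back-transformed with a few lines of Python. The coefficients are copied from the regression equation; the function names are hypothetical, and taking nlogn to mean n·log2(n) is an assumption that is consistent with the fitted values 0.29333 and 6.49317 reported in Table 2.

```python
import math

# Coefficients of the regression equation for Y^0.75.
B0, B1, B2, B3 = 0.0383403, -3.90451e-06, 3.88879e-07, -4.74249e-14
LAMBDA = 0.75  # Box-Cox parameter used for the response


def fitted_transformed(n, include_quadratic=True):
    """Fitted value of Y^0.75 at input size n."""
    value = B0 + B1 * n + B2 * n * math.log2(n)
    if include_quadratic:
        value += B3 * n * n
    return value


def fitted_time(n, include_quadratic=True):
    """Back-transform the fit to seconds: Y = (Y^0.75)^(1/0.75)."""
    return fitted_transformed(n, include_quadratic) ** (1.0 / LAMBDA)


for n in (100000, 1600000):
    print(n, round(fitted_transformed(n), 5), round(fitted_time(n), 4))
```

Calling the functions with include_quadratic=False evaluates the reduced model obtained by dropping the n^2 term without refitting the other coefficients.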
[Residual Plots for Y, four panels: Normal Probability Plot (percent versus residual), Versus Fits (residual versus fitted value), Histogram (frequency versus residual), Versus Order (residual versus observation order)]
Figure 3. Residual plots for Y versus n, nlogn, n^2
Interpretation of residual plots for Y:
The normal probability plot suggests that the errors are almost normally distributed. The plot of
residuals versus the fitted value of the response reveals that the distribution of the ɛi has constant
variance for all values of n within the range of experimentation. The plot of residuals versus
observation order suggests that the errors are independently distributed, as there is no clear
pattern in this plot.
3. CONCLUSIONS
Although the theoretical worst case complexity of RA sort is $Y_{worst}(n) = O(n\sqrt{n})$, the
experimental results reveal an empirical $O_{emp}((n \lg n)^{1.333})$ time complexity for
typical inputs. Interestingly, in the average case our algorithm could serve as a better
choice for certain input distributions for which the popular quick sort algorithm is not an
efficient choice. We leave the task of examining the behaviour of RA sort for various
continuous distribution inputs as future work. General techniques for simulating continuous
and discrete, as well as uniform and non-uniform, random variables can be found in [8]. For a
comprehensive treatment of sorting, see references [9] and [10]. For sorting with emphasis on
the input distribution, [11] may be consulted.
System specification:
Operating system: Windows 8 Pro (64-bit)
RAM: 4 GB
Hard Disk: 500 GB
Processor: Intel core i5, 2.5 GHz
Compiler: GNU GCC
Language: C++
Note: The algorithm RA sort is named after the initials of the last names of the authors Harsh
Ranjan and Sumit Agarwal.
REFERENCES
[1] Sourabh, Suman Kumar and Chakraborty, Soubhik, (2007) “On Why Algorithmic Time Complexity
Measure Can be System Invariant Rather than System Independent”, Applied Mathematics and
Computation, Vol. 190, Issue 1, pp. 195-204.
[2] Chakraborty, Soubhik and Sourabh, Suman Kumar, (2010) A Computer Experiment Oriented
Approach to Algorithmic Complexity, Lambert Academic Publishing.
[3] Hoare, C. A. R., (1962) “Quicksort”, Computer Journal, 5(1), pp. 10-15.
[4] Fang, K. T., Li, R. and Sudjianto, A., (2006) Design and Modeling of Computer Experiments,
Chapman and Hall.
[5] Sacks, Jerome et al., (1989) “Design and Analysis of Computer Experiments”, Statistical
Science, Vol. 4, No. 4, pp. 409-423.
[6] Mathews, Paul, (2010) Design of Experiments with MINITAB, New Age International Publishers,
First Indian Sub-Continent Edition, 294.
[7] Box, G. E. P. and Cox, D. R., (1964) “An Analysis of Transformations”, Journal of the Royal
Statistical Society, Series B, Vol. 26, pp. 211-243.
[8] Ross, Sheldon, (2001) A First Course in Probability, 6th Edition, Pearson Education.
[9] Knuth, Donald, (2000) The Art of Computer Programming, Vol. 3: Sorting and Searching,
Addison Wesley, Pearson Education Reprint.
[10] Levitin, Anany, (2009) Introduction to the Design & Analysis of Algorithms, 2nd ed.,
Pearson Education.
[11] Mahmoud, Hosam, (2000) Sorting: A Distribution Theory, John Wiley and Sons.
Authors
Harsh Ranjan is an undergraduate student in the Department of Computer Science and
Engineering at Birla Institute of Technology, Mesra (India). His research interests include
the design and analysis of algorithms.
Sumit Agarwal is an undergraduate student in the Department of Computer Science and
Engineering at Birla Institute of Technology, Mesra (India). His research interests include
the design and analysis of algorithms.
Niraj Kumar Singh is associated with the Department of Computer Science and
Engineering at Birla Institute of Technology, Mesra (India) as a Teaching cum
Research Fellow. His research interests include the design and analysis of algorithms
and algorithmic complexity analysis through statistical bounds.

More Related Content

PPTX
Varaiational formulation fem
PDF
Quantum algorithm for solving linear systems of equations
PDF
Seminar Report (Final)
PDF
Unger
PPT
CS8451 - Design and Analysis of Algorithms
PDF
International Refereed Journal of Engineering and Science (IRJES)
DOC
algorithm Unit 2
PPT
Rayleigh Ritz Method
Varaiational formulation fem
Quantum algorithm for solving linear systems of equations
Seminar Report (Final)
Unger
CS8451 - Design and Analysis of Algorithms
International Refereed Journal of Engineering and Science (IRJES)
algorithm Unit 2
Rayleigh Ritz Method

What's hot (20)

PPT
S6 l04 analytical and numerical methods of structural analysis
PDF
Advances in composite integer factorization
PDF
PDF
N41049093
PDF
A NEW PARALLEL ALGORITHM FOR COMPUTING MINIMUM SPANNING TREE
PDF
On a Deterministic Property of the Category of k-almost Primes: A Determinist...
PDF
GENETIC ALGORITHM FOR FUNCTION APPROXIMATION: AN EXPERIMENTAL INVESTIGATION
PDF
A Framework for Self-Tuning Optimization Algorithm
PDF
Understanding the Differences between the erfc(x) and the Q(z) functions: A S...
PPT
20070823
PDF
STATISTICAL ANALYSIS OF FUZZY LINEAR REGRESSION MODEL BASED ON DIFFERENT DIST...
PDF
A MODIFIED VORTEX SEARCH ALGORITHM FOR NUMERICAL FUNCTION OPTIMIZATION
PDF
B02402012022
PPTX
Interpolation and its applications
DOCX
83662164 case-study-1
PDF
At35257260
PDF
Numerical approach of riemann-liouville fractional derivative operator
PPTX
Spectral clustering Tutorial
PDF
On Vector Functions With A Parameter
PDF
A New Enhanced Method of Non Parametric power spectrum Estimation.
S6 l04 analytical and numerical methods of structural analysis
Advances in composite integer factorization
N41049093
A NEW PARALLEL ALGORITHM FOR COMPUTING MINIMUM SPANNING TREE
On a Deterministic Property of the Category of k-almost Primes: A Determinist...
GENETIC ALGORITHM FOR FUNCTION APPROXIMATION: AN EXPERIMENTAL INVESTIGATION
A Framework for Self-Tuning Optimization Algorithm
Understanding the Differences between the erfc(x) and the Q(z) functions: A S...
20070823
STATISTICAL ANALYSIS OF FUZZY LINEAR REGRESSION MODEL BASED ON DIFFERENT DIST...
A MODIFIED VORTEX SEARCH ALGORITHM FOR NUMERICAL FUNCTION OPTIMIZATION
B02402012022
Interpolation and its applications
83662164 case-study-1
At35257260
Numerical approach of riemann-liouville fractional derivative operator
Spectral clustering Tutorial
On Vector Functions With A Parameter
A New Enhanced Method of Non Parametric power spectrum Estimation.
Ad

Viewers also liked (17)

PDF
Green wsn optimization of energy use
PDF
CAPTURE THE TALENT: SECONDARY SCHOOL EDUCATION WITH CYBER SECURITY COMPETITIONS
PDF
A new model for software costestimation
PDF
Clustbigfim frequent itemset mining of
PDF
PERFORMANCE ANALYSIS OF TEXTURE IMAGE RETRIEVAL FOR CURVELET, CONTOURLET TRAN...
PDF
Edge tenacity in cycles and complete
PDF
Comparative study of different algorithms
PDF
Agent based frameworks for distributed association rule mining an analysis
PDF
Designing digital comprehensive system to test and assess the intelligently b...
PDF
A HYBRID COA/ε-CONSTRAINT METHOD FOR SOLVING MULTI-OBJECTIVE PROBLEMS
PDF
Distribution of maximal clique size under
PDF
Migration strategies for object oriented system to component based system
PDF
Is your shopping cart secure
PDF
An interactive approach to requirements prioritization using quality factors
PDF
Mining of product reviews at aspect level
PDF
Defragmentation of indian legal cases with
PDF
A framework for the evaluation of saas
Green wsn optimization of energy use
CAPTURE THE TALENT: SECONDARY SCHOOL EDUCATION WITH CYBER SECURITY COMPETITIONS
A new model for software costestimation
Clustbigfim frequent itemset mining of
PERFORMANCE ANALYSIS OF TEXTURE IMAGE RETRIEVAL FOR CURVELET, CONTOURLET TRAN...
Edge tenacity in cycles and complete
Comparative study of different algorithms
Agent based frameworks for distributed association rule mining an analysis
Designing digital comprehensive system to test and assess the intelligently b...
A HYBRID COA/ε-CONSTRAINT METHOD FOR SOLVING MULTI-OBJECTIVE PROBLEMS
Distribution of maximal clique size under
Migration strategies for object oriented system to component based system
Is your shopping cart secure
An interactive approach to requirements prioritization using quality factors
Mining of product reviews at aspect level
Defragmentation of indian legal cases with
A framework for the evaluation of saas
Ad

Similar to Design and analysis of ra sort (20)

PDF
A statistical comparative study of
PDF
A STATISTICAL COMPARATIVE STUDY OF SOME SORTING ALGORITHMS
PDF
Study on Sorting Algorithm and Position Determining Sort
PDF
Linear time sorting algorithms
PDF
A unique sorting algorithm with linear time &amp; space complexity
PDF
Daa chapter5
PDF
Sorting and Searching Techniques
PPT
Cis435 week06
PPT
Data Structure (MC501)
PDF
K-Sort: A New Sorting Algorithm that Beats Heap Sort for n 70 Lakhs!
PPT
Introduction
PPT
Algorithm in Computer, Sorting and Notations
PDF
Average sort
DOCX
Selection sort lab mannual
PPT
Counting Sort Lowerbound
PPT
Algorithm
PPT
Algorithm
PPT
Insertion sort bubble sort selection sort
PPT
358 33 powerpoint-slides_14-sorting_chapter-14
A statistical comparative study of
A STATISTICAL COMPARATIVE STUDY OF SOME SORTING ALGORITHMS
Study on Sorting Algorithm and Position Determining Sort
Linear time sorting algorithms
A unique sorting algorithm with linear time &amp; space complexity
Daa chapter5
Sorting and Searching Techniques
Cis435 week06
Data Structure (MC501)
K-Sort: A New Sorting Algorithm that Beats Heap Sort for n 70 Lakhs!
Introduction
Algorithm in Computer, Sorting and Notations
Average sort
Selection sort lab mannual
Counting Sort Lowerbound
Algorithm
Algorithm
Insertion sort bubble sort selection sort
358 33 powerpoint-slides_14-sorting_chapter-14

More from ijfcstjournal (20)

PDF
DESIGNING DIGITAL COMPREHENSIVE SYSTEM TO TEST AND ASSESS THE INTELLIGENTLY B...
PDF
SYSTEM ANALYSIS AND DESIGN FOR A BUSINESS DEVELOPMENT MANAGEMENT SYSTEM BASED...
PDF
Call For Papers - 12th International Conference on Information Technology, Co...
PDF
ENHANCING COLLEGE ENGLISH COURSE EVALUATION THROUGH INTERNET-PLUS TOOLS AT GE...
PDF
AN ALGORITHM FOR SOLVING LINEAR OPTIMIZATION PROBLEMS SUBJECTED TO THE INTERS...
PDF
Benchmarking Large Language Models with a Unified Performance Ranking Metric
PDF
NEW APPROACH FOR SOLVING SOFTWARE PROJECT SCHEDULING PROBLEM USING DIFFERENTI...
PDF
Call For Papers - 15th International Conference on Computer Science, Engineer...
PDF
A SURVEY TO REAL-TIME MESSAGE-ROUTING NETWORK SYSTEM WITH KLA MODELLING
PDF
SEGMENTATION AND RECOGNITION OF HANDWRITTEN DIGIT NUMERAL STRING USING A MULT...
PDF
Multiprocessor Scheduling of Dependent Tasks to Minimize Makespan and Reliabi...
PDF
PATTERN RECOGNITION USING CONTEXTDEPENDENT MEMORY MODEL (CDMM) IN MULTIMODAL ...
PDF
Call For Papers - 12th International Conference on Foundations of Computer Sc...
PDF
PERFORMANCE ANALYSIS OF TEXTURE IMAGE RETRIEVAL FOR CURVELET, CONTOURLET TRAN...
PDF
A DECISION SUPPORT SYSTEM FOR ESTIMATING COST OF SOFTWARE PROJECTS USING A HY...
PDF
A MODIFIED DNA COMPUTING APPROACH TO TACKLE THE EXPONENTIAL SOLUTION SPACE OF...
PDF
THE RISK ASSESSMENT AND TREATMENT APPROACH IN ORDER TO PROVIDE LAN SECURITY B...
PDF
Call For Papers - 12th International Conference on Foundations of Computer Sc...
PDF
Modelling of Walking Humanoid Robot With Capability of Floor Detection and Dy...
PDF
Providing A Model For Selecting Information Security Control Objectives Using...
DESIGNING DIGITAL COMPREHENSIVE SYSTEM TO TEST AND ASSESS THE INTELLIGENTLY B...
SYSTEM ANALYSIS AND DESIGN FOR A BUSINESS DEVELOPMENT MANAGEMENT SYSTEM BASED...
Call For Papers - 12th International Conference on Information Technology, Co...
ENHANCING COLLEGE ENGLISH COURSE EVALUATION THROUGH INTERNET-PLUS TOOLS AT GE...
AN ALGORITHM FOR SOLVING LINEAR OPTIMIZATION PROBLEMS SUBJECTED TO THE INTERS...
Benchmarking Large Language Models with a Unified Performance Ranking Metric
NEW APPROACH FOR SOLVING SOFTWARE PROJECT SCHEDULING PROBLEM USING DIFFERENTI...
Call For Papers - 15th International Conference on Computer Science, Engineer...
A SURVEY TO REAL-TIME MESSAGE-ROUTING NETWORK SYSTEM WITH KLA MODELLING
SEGMENTATION AND RECOGNITION OF HANDWRITTEN DIGIT NUMERAL STRING USING A MULT...
Multiprocessor Scheduling of Dependent Tasks to Minimize Makespan and Reliabi...
PATTERN RECOGNITION USING CONTEXTDEPENDENT MEMORY MODEL (CDMM) IN MULTIMODAL ...
Call For Papers - 12th International Conference on Foundations of Computer Sc...
PERFORMANCE ANALYSIS OF TEXTURE IMAGE RETRIEVAL FOR CURVELET, CONTOURLET TRAN...
A DECISION SUPPORT SYSTEM FOR ESTIMATING COST OF SOFTWARE PROJECTS USING A HY...
A MODIFIED DNA COMPUTING APPROACH TO TACKLE THE EXPONENTIAL SOLUTION SPACE OF...
THE RISK ASSESSMENT AND TREATMENT APPROACH IN ORDER TO PROVIDE LAN SECURITY B...
Call For Papers - 12th International Conference on Foundations of Computer Sc...
Modelling of Walking Humanoid Robot With Capability of Floor Detection and Dy...
Providing A Model For Selecting Information Security Control Objectives Using...

Recently uploaded (20)

PDF
Accessing-Finance-in-Jordan-MENA 2024 2025.pdf
PDF
NewMind AI Weekly Chronicles – August ’25 Week IV
PDF
CloudStack 4.21: First Look Webinar slides
PPTX
Build Your First AI Agent with UiPath.pptx
PPTX
Custom Battery Pack Design Considerations for Performance and Safety
PDF
The-2025-Engineering-Revolution-AI-Quality-and-DevOps-Convergence.pdf
PDF
Credit Without Borders: AI and Financial Inclusion in Bangladesh
PPTX
Microsoft Excel 365/2024 Beginner's training
PDF
OpenACC and Open Hackathons Monthly Highlights July 2025
PPTX
TEXTILE technology diploma scope and career opportunities
PDF
sustainability-14-14877-v2.pddhzftheheeeee
PDF
“A New Era of 3D Sensing: Transforming Industries and Creating Opportunities,...
PDF
Enhancing plagiarism detection using data pre-processing and machine learning...
DOCX
Basics of Cloud Computing - Cloud Ecosystem
PDF
Five Habits of High-Impact Board Members
PPTX
MicrosoftCybserSecurityReferenceArchitecture-April-2025.pptx
PDF
sbt 2.0: go big (Scala Days 2025 edition)
PDF
5-Ways-AI-is-Revolutionizing-Telecom-Quality-Engineering.pdf
PDF
The-Future-of-Automotive-Quality-is-Here-AI-Driven-Engineering.pdf
PDF
Taming the Chaos: How to Turn Unstructured Data into Decisions
Accessing-Finance-in-Jordan-MENA 2024 2025.pdf
NewMind AI Weekly Chronicles – August ’25 Week IV
CloudStack 4.21: First Look Webinar slides
Build Your First AI Agent with UiPath.pptx
Custom Battery Pack Design Considerations for Performance and Safety
The-2025-Engineering-Revolution-AI-Quality-and-DevOps-Convergence.pdf
Credit Without Borders: AI and Financial Inclusion in Bangladesh
Microsoft Excel 365/2024 Beginner's training
OpenACC and Open Hackathons Monthly Highlights July 2025
TEXTILE technology diploma scope and career opportunities
sustainability-14-14877-v2.pddhzftheheeeee
“A New Era of 3D Sensing: Transforming Industries and Creating Opportunities,...
Enhancing plagiarism detection using data pre-processing and machine learning...
Basics of Cloud Computing - Cloud Ecosystem
Five Habits of High-Impact Board Members
MicrosoftCybserSecurityReferenceArchitecture-April-2025.pptx
sbt 2.0: go big (Scala Days 2025 edition)
5-Ways-AI-is-Revolutionizing-Telecom-Quality-Engineering.pdf
The-Future-of-Automotive-Quality-is-Here-AI-Driven-Engineering.pdf
Taming the Chaos: How to Turn Unstructured Data into Decisions

Design and analysis of ra sort

  • 1. International Journal in Foundations of Computer Science & Technology (IJFCST), Vol.5, No.1, January 2015 DOI:10.5121/ijfcst.2015.5106 59 DESIGN AND ANALYSIS OF RA SORT Harsh Ranjan1 , Sumit Agarwal1 and Niraj Kumar Singh*1 1 Department of Computer Science and Engineering, Birla Institute of Technology, Mesra, India ABSTRACT This paper introduces a new comparison base stable sorting algorithm, named RA sort. The RA sort involves only the comparison of pair of elements in an array which ultimately sorts the array and does not involve the comparison of each element with every other element. It tries to build upon the relationship established between the elements in each pass. Instead of going for a blind comparison we prefer a selective comparison to get an efficient method. Sorting is a fundamental operation in computer science. This algorithm is analysed both theoretically and empirically to get a robust average case result. We have performed its Empirical analysis and compared its performance with the well-known quick sort for various input types. Although the theoretical worst case complexity of RA sort is Yworst(n) = O(n√ ), the experimental results suggest an empirical Oemp(nlgn)1.333 time complexity for typical input instances, where the parameter n characterizes the input size. The theoretical complexity is given for comparison operation. We emphasize that the theoretical complexity is operation specific whereas the empirical one represents the overall algorithmic complexity. KEYWORDS Algorithm, quick sort, RA sort, Theoretical Analysis, Empirical Analysis, Uniform Distribution Model, Poisson Model, Binomial Model. 1. INTRODUCTION This paper introduces a new comparison base stable sorting algorithm, named RA sort. Though many sorting algorithms have been developed, no single technique is best suited for all applications. 
In basic comparison sort algorithm we need to check each element with the rest of the array in order to find its appropriate location. In RA sort we only need to compare selective pairs of whose elements are distant apart in a defined manner. This selective comparison among elements saves a fair amount of time and comparisons. Although the theoretical worst case complexity of RA sort is Yworst(n) = O(n√ ), the experimental results reveal that with Oemp(nlgn)1.333 time complexity for typical inputs it can perform optimally. 2. ALGORITHM RA SORT The RA Sort algorithm involves only the comparison of pair of elements in an array which ultimately sorts the array and does not involve the comparison of each element with every other element. This algorithm tries to build upon the relationship established between the elements in each pass. For an input (a1, a2, a3) let a1<a2 and a2<a3, then it can be easily inference that a1<a3 and so there is no need to compare a1 and a3. The algorithm uses this technique to place each element in their appropriate location by saving significantly large number of comparisons.
  • 2. RA sort first determines the minimum length such that all elements get placed in their appropriate locations. This length is the maximum forward distance over which an index is compared. The algorithm starts with this length and decrements it down to one, at each step comparing every element only with the element a fixed length ahead of it. The minimum length can be found by binary search, since for all values greater than it the array will be sorted and for all values less than it the array will be only partially sorted.
    Consider an array of four elements. In a basic comparison sort, a1 needs to be compared with a2, a3 and a4, and the same goes for a2, a3 and a4, before the sorting is complete. If the length equals two, then in the first loop a1 is compared only with a3 and a2 only with a4. In the second loop, when the length is decremented to one, a1 is compared only with a2, a2 only with a3, and a3 only with a4. These comparisons take place one by one; at each point the value at an index may change, and the updated value at that index is used for later comparisons. Thus the entire array is sorted in only five comparisons (two when the length equals two and three when the length equals one) instead of a total of twelve. This difference grows greatly as the size of the array increases.
    The minimum length has no general formula valid for every input size, but a rough estimate can be made. The estimate gives the minimum value in most cases, and in a few cases a slightly higher value, which still sorts the array correctly. Let K denote this minimum value. Then $K = T \cdot \lg n$, where $T \approx \sqrt{2n} / \lg n$.
    Derivation of T: Let n denote the input size of a sample and let $h = \lg n$. The maximum jump required by any element to reach its correct position is $n - 1$ (the smallest element is at the last position, or the largest at the first). After x iterations, the maximum jump that RA sort can give any element from its initial location is $1 + 2 + 3 + \dots + x$. Multiplying x by h and summing the series gives $(xh)(xh + 1)/2$. This value must be at least $n - 1$ so that every element can reach its appropriate location in the worst case. Replacing $n - 1$ by n for ease of calculation:
        $(xh)(xh + 1)/2 \ge n$
    Setting $z = xh$, we have
        $z(z + 1) \ge 2n \;\Rightarrow\; z^2 + z - 2n \ge 0 \;\Rightarrow\; z = (-1 + \sqrt{1 + 8n})/2$,
    and since $z = xh$, we get $x = [\sqrt{1 + 8n} - 1]/(2h)$. Thus $T = \lceil x \rceil$, the ceiling covering boundary cases. Because h is treated as a constant when solving this quadratic relation for x, the required formula for large n is $T \approx \sqrt{2n} / \lg n$. The minimum length is given as $T \cdot \lg n$. In the general case, any length less than this cannot sort the array completely, since not every element would end up at its appropriate location, and every greater length will.
    Algorithm: RA_Sort(A, n):   // A[0 : n-1] is the input array of size n
        FOR (I = T*⌈lg n⌉; I ≥ 1; I = I-1)     // T ≈ √(2*n) / lg n
            FOR (J = 0; J+I < n; J = J+1)
                IF (A[J] > A[J+I])
                    Exchange(A[J], A[J+I])
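The pseudocode above can be sketched in C++ (the implementation language the paper lists in its system specification). This is a minimal sketch rather than the authors' code: the names `ra_initial_gap`, `ra_sort_with_gap` and `ra_sort` are hypothetical, the starting gap uses the ceiling form of T from the derivation, and clamping the gap to n-1 is an added assumption (a gap of n or more compares nothing). The sketch does not check that the starting gap is sufficient for every input; that remains the paper's claim.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical helper: starting gap K = T * ceil(lg n), where
// T = ceil((sqrt(1 + 8n) - 1) / (2 lg n)) as in the derivation of T.
// Assumption: K is clamped to n - 1, the largest gap that compares anything.
std::size_t ra_initial_gap(std::size_t n) {
    if (n < 2) return 0;  // zero or one element: nothing to compare
    const double h = std::log2(static_cast<double>(n));
    const double x = (std::sqrt(1.0 + 8.0 * static_cast<double>(n)) - 1.0) / (2.0 * h);
    const std::size_t T = static_cast<std::size_t>(std::ceil(x));
    const std::size_t K = T * static_cast<std::size_t>(std::ceil(h));
    return K < n ? K : n - 1;
}

// One left-to-right pass per gap, for gap = max_gap down to 1.
// Returns the number of element comparisons performed.
std::size_t ra_sort_with_gap(std::vector<int>& a, std::size_t max_gap) {
    assert(max_gap == 0 || max_gap < a.size());  // larger gaps would compare nothing
    std::size_t comparisons = 0;
    for (std::size_t gap = max_gap; gap >= 1; --gap) {
        for (std::size_t j = 0; j + gap < a.size(); ++j) {
            ++comparisons;
            if (a[j] > a[j + gap]) std::swap(a[j], a[j + gap]);
        }
    }
    return comparisons;
}

// Convenience wrapper using the estimated starting gap.
std::size_t ra_sort(std::vector<int>& a) {
    return ra_sort_with_gap(a, ra_initial_gap(a.size()));
}
```

Calling `ra_sort_with_gap` on the four-element array {4, 3, 2, 1} with a starting gap of 2 reproduces the count in the text: two comparisons at gap 2 and three at gap 1.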
  • 3. Analysis is an integral part of any algorithm; it gives an idea of the resources consumed during its execution. The RA sort algorithm is analyzed both theoretically and empirically. Theoretical analysis is well suited to obtaining worst case results, as it in some sense gives a guarantee in terms of performance. At the same time, this guarantee can be conservative. In such cases a certificate on the level of conservativeness is necessary, and it is given by the concept of empirical O discussed in this paper. We have done a theoretical analysis over the comparison operation to get the worst case performance in big-oh notation. The average case analysis is done using a statistical bound estimate (also called empirical-O). The reader is referred to [1] and [2] for insight into empirical-O.
    2.1.1. Theoretical Analysis (Worst Case Only)
    Referring to the pseudocode of RA sort (it contains two for loops), its runtime complexity is expressed as the following summation:
        $Y(n) = \sum_{I=1}^{T \lceil \lg n \rceil} \sum_{J=0}^{n-I-1} 1$        (1)
    The inner loop performs $n - I$ comparisons for each value of I, so $Y(n) \le n \cdot T \lceil \lg n \rceil$. The worst case equation is therefore $Y_{worst}(n) = O(T \cdot \lg n \cdot n) = O(n\sqrt{n})$, which is obtained by substituting $T \approx \sqrt{2n} / \lg n$.
    2.1.2. Empirical Analysis (Average Case Only)
    This section presents the empirical results obtained for the average case analysis of RA sort and quick sort. The algorithm was run on data drawn from uniform and non-uniform discrete distribution models: the discrete uniform, Poisson and binomial distributions. The performance of RA sort is also compared with the standard version of the quick sort algorithm [3] for the same input types. The observed mean time (in seconds) of 1000 trials is recorded in Table 1. Average case analysis was done by working directly on program run time to estimate the weight-based statistical bound over a finite range by running computer experiments [4], [5]. This estimate is called empirical O [1], [2]. Here the time of an operation is taken as its weight. Weighing permits collective consideration of all operations in a conceptual bound, which we call a statistical bound to distinguish it from the count-based mathematical bounds that are operation specific. The way a computer experiment is designed and analyzed has a great impact on the credibility of empirical-O. See reference [2] for more insight into the philosophy behind the statistical bound and empirical-O. The statistical analysis and the various interpretations are guided by [6].
    The samples are generated randomly, using a random number generating function, to characterize the discrete uniform, Poisson and binomial distribution models with k, λ and (m, p) as their respective parameters. Our sample sizes lie between $1 \times 10^5$ and $20 \times 10^5$.
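The experimental setup described above (random samples from three discrete models, mean run time over repeated trials) can be sketched with the C++ standard library. This is an illustrative sketch under stated assumptions, not the authors' harness: the function names are hypothetical, the discrete uniform range is assumed to be [1, k], and each trial here re-sorts a fresh copy of the same sample, whereas the paper does not say whether its 1000 trials reuse or regenerate the data.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <random>
#include <vector>

// Hypothetical helper: draw n values from any standard integer distribution.
template <typename Distribution>
std::vector<int> draw(Distribution dist, std::size_t n, std::mt19937& rng) {
    std::vector<int> out(n);
    for (int& v : out) v = dist(rng);
    return out;
}

// Discrete uniform sample; the range [1, k] is an assumption.
std::vector<int> make_uniform(std::size_t n, int k, std::mt19937& rng) {
    return draw(std::uniform_int_distribution<int>(1, k), n, rng);
}

// Poisson sample with mean lambda.
std::vector<int> make_poisson(std::size_t n, double lambda, std::mt19937& rng) {
    return draw(std::poisson_distribution<int>(lambda), n, rng);
}

// Binomial sample with parameters (m, p).
std::vector<int> make_binomial(std::size_t n, int m, double p, std::mt19937& rng) {
    return draw(std::binomial_distribution<int>(m, p), n, rng);
}

// Mean wall-clock time in seconds of sorter(copy of sample) over `trials` runs.
// Only the sorter call is timed; the copy is made outside the timed region.
template <typename Sorter>
double mean_runtime_seconds(Sorter sorter, const std::vector<int>& sample, int trials) {
    assert(trials > 0);
    double total = 0.0;
    for (int t = 0; t < trials; ++t) {
        std::vector<int> work = sample;  // fresh unsorted copy for every trial
        const auto start = std::chrono::steady_clock::now();
        sorter(work);
        const auto stop = std::chrono::steady_clock::now();
        total += std::chrono::duration<double>(stop - start).count();
    }
    return total / trials;
}
```

With the parameter values used in the paper's experiments, the samples would be drawn as `make_uniform(n, 50, rng)`, `make_poisson(n, 4.0, rng)` and `make_binomial(n, 400, 0.5, rng)`, and any sorting routine taking a `std::vector<int>&` can be passed as `sorter`.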
  • 4. Table 1. Observed mean times in seconds for RA sort and quick sort. DU: discrete uniform, k = 50; Poisson: λ = 4; Binomial: (m, p) = (400, 0.5).
               N | RA sort                    | Quick sort
                 |       DU  Poisson Binomial |       DU  Poisson Binomial
          100000 |    0.203    0.187    0.187 |    0.218    1.422    0.125
          200000 |    0.531    0.515    0.531 |    0.812    5.688    0.421
          300000 |    0.969    0.937    1.000 |    1.811    9.883    0.938
          400000 |    1.500    1.508    1.515 |    3.213   14.313    1.641
          500000 |    2.112    2.161    2.140 |    5.057   19.524    2.516
          600000 |    2.793    2.915    2.828 |    7.299   25.139    3.657
          700000 |    3.500    3.619    3.484 |   10.067   30.213    5.234
          800000 |    4.328    4.619    4.298 |   13.047   37.169    6.406
          900000 |    5.141    5.301    5.121 |   16.364   42.130    8.110
         1000000 |    6.023    6.042    5.984 |   20.251   46.929   10.016
         1100000 |    6.953    7.027    6.938 |   24.564   51.142   12.095
         1200000 |    7.924    7.975    7.933 |   29.095   56.521   14.438
         1300000 |    8.876    9.079    8.876 |   34.219   60.998   16.986
         1400000 |    9.878   10.079    9.891 |   39.496   64.321   19.579
         1500000 |   10.997   11.108   10.969 |   45.355   69.032   22.501
         1600000 |   12.050   12.239   12.047 |   51.707   74.328   25.693
         1700000 |   13.290   13.548   13.291 |   58.335   78.431   28.485
         1800000 |   14.441   14.774   14.449 |   65.279   84.320   32.563
         1900000 |   15.627   15.907   15.835 |   72.787   90.001   36.222
         2000000 |   16.871   17.243   17.070 |   80.403   95.113   41.012
    Below we present two comparative plots of RA sort against quick sort. Figures 1 and 2 reveal the superiority of RA sort for the discrete uniform and Poisson distribution data models for the specified parameter values.
    [Figure 1. Plot of n versus y (discrete uniform distribution, k=50). Axes: input size (n) versus mean time in seconds (y). Series: RA sort (PS) versus quick sort (QS).]
  • 5. [Figure 2. Plot of n versus y (Poisson distribution, λ=4). Axes: input size (n) versus mean time in seconds (y). Series: RA sort (PS) versus quick sort (QS).]
    Table 2. General Regression Analysis: Y versus n, nlogn, n^2
    Box-Cox transformation of the response with specified lambda = 0.75
    Regression Equation
        Y^0.75 = 0.0383403 - 3.90451e-006 n + 3.88879e-007 nlogn - 4.74249e-014 n^2
    Coefficients
        Term             Coef     SE Coef         T      P
        Constant    0.0383403   0.0250803   1.52870  0.146
        n          -0.0000039   0.0000010  -3.79282  0.002
        nlogn       0.0000004   0.0000001   7.38394  0.000
        n^2        -0.0000000   0.0000000  -1.06788  0.301
    Summary of Model
        S = 0.0125067      R-Sq = 100.00%        R-Sq(adj) = 100.00%
        PRESS = 0.00435678 R-Sq(pred) = 100.00%
    Analysis of Variance
        Source      DF   Seq SS   Adj SS   Adj MS       F         P
        Regression   3  121.313  121.313  40.4378  258523  0.000000
        n            1  121.174    0.002   0.0023      14  0.001597
        nlogn        1    0.139    0.009   0.0085      55  0.000002
        n^2          1    0.000    0.000   0.0002       1  0.301421
        Error       16    0.003    0.003   0.0002
        Total       19  121.316
    Fits and Diagnostics for Unusual Observations for Transformed Response
        Obs   Y^0.75      Fit     SE Fit    Residual  St Resid
          1  0.30243  0.29333  0.0106239   0.0090990   1.37876  X
         16  6.46756  6.49317  0.0043949  -0.0256095  -2.18714  R
    Fits for Unusual Observations for Original Response
        Obs       Y      Fit
          1   0.203   0.1949  X
         16  12.050  12.1137  R
  • 6. R denotes an observation with a large standardized residual. X denotes an observation whose X value gives it large leverage.
    Durbin-Watson statistic = 1.75208
    The program runtime data corresponding to the discrete uniform distribution samples is fitted to a quadratic model of the type $y = b_0 + b_1 n + b_2 n \log_2 n + b_3 n^2$. To get a suitable model, the response variable is transformed using the Box-Cox transformation [7], with the transformation parameter lambda set to 0.75. Below is the corresponding regression model (the complete result is available in Table 2).
    Regression Equation:
        y^0.75 = 0.0383403 - 3.90451e-006 n + 3.88879e-007 nlogn - 4.74249e-014 n^2
    As the statistical significance of the quadratic term is very weak (P = 0.301), we drop it from the model. The resulting model is:
        y^0.75 = 0.0383403 - 3.90451e-006 n + 3.88879e-007 nlogn
    Consequently we have $y^{0.75} = O_{emp}(n \log_2 n)$, which implies that $y = O_{emp}((n \log_2 n)^{1/0.75}) = O_{emp}((n \log_2 n)^{1.333})$. The standard error of this model is very low (S = 0.0125067) and the model explains almost all of the variation (the R-Sq(adj) value equals 100%). These observations lead us to conclude that the average case complexity of RA sort is $Y_{avg}(n) = O_{emp}((n \log_2 n)^{1.333})$.
    [Figure 3. Residual plots for Y versus n, nlogn, n^2. Panels: Normal Probability Plot (percent versus residual), Versus Fits (residual versus fitted value), Histogram (frequency versus residual), Versus Order (residual versus observation order).]
    Interpretation of residual plots for Y: The normal probability plot suggests that the errors are almost normally distributed. The plot of residuals versus the fitted value of the response reveals that the distribution of the errors ɛi has constant variance for all values of n within the range of experimentation. The plot of residuals versus observation order suggests that the errors are independently distributed, as there is no clear pattern in this plot.
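To make the back-transformation from $y^{0.75}$ to y concrete, the fitted regression equation can be evaluated directly. The sketch below is illustrative only: the function names are hypothetical, the coefficients are copied from the regression equation in Table 2, and the logarithm is taken to base 2, as in the model $y = b_0 + b_1 n + b_2 n \log_2 n + b_3 n^2$. The fit is only meaningful inside the experimental range of n (1×10^5 to 20×10^5).

```cpp
#include <cassert>
#include <cmath>

// Coefficients copied from the regression equation reported in Table 2.
constexpr double kB0 = 0.0383403;
constexpr double kB1 = -3.90451e-06;   // coefficient of n
constexpr double kB2 = 3.88879e-07;    // coefficient of n * log2(n)
constexpr double kB3 = -4.74249e-14;   // coefficient of n^2
constexpr double kLambda = 0.75;       // Box-Cox parameter

// Fitted value of the transformed response y^0.75 at input size n.
// Pass include_quadratic = false for the reduced model without the n^2 term.
double fitted_transformed(double n, bool include_quadratic) {
    assert(n > 0.0);  // log2 is undefined otherwise
    double value = kB0 + kB1 * n + kB2 * n * std::log2(n);
    if (include_quadratic) value += kB3 * n * n;
    return value;
}

// Back-transform to the original response: y = (y^0.75)^(1 / 0.75), in seconds.
double fitted_seconds(double n, bool include_quadratic) {
    return std::pow(fitted_transformed(n, include_quadratic), 1.0 / kLambda);
}
```

For n = 100000 with the quadratic term included, the transformed fit comes out close to 0.29333 and the back-transformed fit close to 0.1949 seconds, in line with the fits Table 2 reports for observation 1.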
  • 7. 3. CONCLUSIONS
    Although the theoretical worst case complexity of RA sort is $Y_{worst}(n) = O(n\sqrt{n})$, the experimental results reveal an empirical $O_{emp}((n \lg n)^{1.333})$ time complexity for typical inputs, with which it can perform well in practice. Interestingly, in the average case our algorithm could serve as a better choice for certain distributions of data for which the popular quick sort algorithm is not an efficient choice. We leave the task of examining the behaviour of RA sort for various continuous distribution inputs as future work. The general techniques for simulating continuous and discrete, as well as uniform and non-uniform, random variables can be found in [8]. For comprehensive literature on sorting, see references [9] and [10]. For sorting with emphasis on the input distribution, [11] may be consulted.
    System specification:
        Operating system: Windows 8 Pro (64-bit)
        RAM: 4 GB
        Hard Disk: 500 GB
        Processor: Intel Core i5, 2.5 GHz
        Compiler: GNU GCC
        Language: C++
    Note: The algorithm RA sort is named after the initials of the last names of the authors Harsh Ranjan and Sumit Agarwal.
    REFERENCES
    [1] Sourabh, Suman Kumar and Chakraborty, Soubhik (2007) “On Why Algorithmic Time Complexity Measure Can be System Invariant Rather than System Independent”, Applied Mathematics and Computation, Vol. 190, Issue 1, pp. 195-204.
    [2] Chakraborty, Soubhik and Sourabh, Suman Kumar (2010) A Computer Experiment Oriented Approach to Algorithmic Complexity, Lambert Academic Publishing.
    [3] Hoare, C.A.R. (1962) “Quicksort”, Computer Journal, Vol. 5, No. 1, pp. 10-15.
    [4] Fang, K.T., Li, R. and Sudjianto, A. (2006) Design and Modeling of Computer Experiments, Chapman and Hall.
    [5] Sacks, Jerome et al. (1989) “Design and Analysis of Computer Experiments”, Statistical Science, Vol. 4, No. 4, pp. 409-423.
    [6] Mathews, Paul (2010) Design of Experiments with MINITAB, New Age International Publishers, First Indian Sub-Continent Edition, 294.
    [7] Box, G.E.P. and Cox, D.R. (1964) “An Analysis of Transformations”, Journal of the Royal Statistical Society, Series B, Vol. 26, pp. 211-243.
    [8] Ross, Sheldon (2001) A First Course in Probability, 6th Edition, Pearson Education.
    [9] Knuth, Donald (2000) The Art of Computer Programming, Vol. 3: Sorting and Searching, Addison Wesley, Pearson Education Reprint.
    [10] Levitin, Anany (2009) Introduction to the Design & Analysis of Algorithms, 2nd ed., Pearson Education.
    [11] Mahmoud, Hosam (2000) Sorting: A Distribution Theory, John Wiley and Sons.
    Authors
    Harsh Ranjan is an undergraduate student in the Department of Computer Science and Engineering at Birla Institute of Technology, Mesra (India). His research interests include the design and analysis of algorithms.
  • 8. Sumit Agarwal is an undergraduate student in the Department of Computer Science and Engineering at Birla Institute of Technology, Mesra (India). His research interests include the design and analysis of algorithms.
    Niraj Kumar Singh is associated with the Department of Computer Science and Engineering at Birla Institute of Technology, Mesra (India) as a Teaching cum Research Fellow. His research interests include the design and analysis of algorithms, and algorithmic complexity analysis through statistical bounds.