A Machine Learning Approach To Classifying Construction Cost Documents into the International Construction Measurement Standard
Abstract
We introduce the first automated models for classifying natural language descriptions provided in cost documents
called “Bills of Quantities” (BoQs), popular in the infrastructure construction industry, into the International Construc-
tion Measurement Standard (ICMS). The presented analysis and models are aimed at vitalising the adoption of ICMS
and thus providing benchmarkers with an effective automated tool to allow for project comparison in a more granular
way. The presented study addresses the challenges involved and sets forth models to facilitate widespread, effective
analysis of cost and performance in infrastructure construction projects. The models we deployed and systematically evaluated
for multi-class text classification are learnt from a dataset of more than 50 thousand descriptions of items retrieved
from 24 large infrastructure construction projects across the United Kingdom.
We describe our approach to language representation and subsequent modelling to examine the strength of con-
textual semantics and temporal dependency of language used in construction project documentation. To do that we
evaluate two experimental pipelines for inferring ICMS codes from text, on the basis of two different language rep-
resentation models and a range of state-of-the-art sequence-based classification methods, including recurrent and
convolutional neural network architectures.
The findings indicate that a highly effective and accurate ICMS automation model is within reach, with a reported
average F1 score above 90% across 32 ICMS categories. Furthermore, due to the specific nature of the language used
in BoQ text (short, largely descriptive and technical), we find that simpler models compare favourably, achieving
higher accuracy. Our analysis suggests that the relevant information is more likely embedded in local key features
of the descriptive text, which explains why a simpler, generic temporal convolutional network (TCN) exhibits memory
comparable to recurrent architectures of the same capacity, and subsequently outperforms these at this task.
Keywords: Natural language processing (NLP), Deep learning, Automation in Building Information Modelling
(BIM), Artificial Intelligence (AI), ICMS, Short text classification, Recurrent and convolutional neural networks
(LSTM, GRU, CNN), Temporal convolutional networks (TCN).
Email: {ignacio.deza, hisham.ihshaish}@uwe.ac.uk.
Figure 1: ICMS is designed to allow comparison between BoQ-like documents which–in general–don’t have means of comparison apart from the
text description. Although many very detailed rules of measurement and naming conventions exist, they tend to be so granular that the comparison
between items becomes harder because similar items fall in different categories. For this reason, ICMS is designed as a high-level, elementary cost
and carbon measurement standard, with the potential to make cost documents more transparent and international.
For these reasons the ICMS project presents itself as a very high-profile venture, with the potential to disrupt present construction methodologies towards a more transparent, international and sustainable future.

The work presented in this article aims to foster adoption of this new standard. Adoption of a common standard is usually difficult unless there is an immediate gain for the participants. As with many processes in the construction industry, the classification of BoQs is usually done manually. This makes adoption of an extra standard, one focused on benchmarking and optimisation rather than on day-to-day operations, an extra burden that will more often than not be avoided. Fortunately, the Department for Transport (DfT)1 of the UK has directed many Government-owned companies who administer parts of its infrastructure to comply with the ICMS. This study, which comes under the TIES Living Labs[12] project sponsored by the UK Government, is in line with these efforts.

2. Data and Methods

The natural language descriptions found in the BoQs are predominantly short. Similar to texts analysed in studies of sentiment analysis[13], dialogue systems[14, 15, 16] and user query intent understanding[17], among others, inferring from such text is known to be especially challenging. This is because of the often limited contextual information they are accompanied by, compared to that of long texts found in books and documents.

The contextual information in the analysed texts is naturally present, albeit simpler, compared to the complexity often embedded in other types of natural language. This is largely because BoQ descriptions are considerably condensed, short and strictly descriptive, as they are essentially intended to be informative, with neither emotional nor opinion components to them. This can add to the challenge of extracting the semantics for the classification effort, compared to other types of short texts from media or tweets.

Moreover, the classification of descriptions of tasks in natural language into a set of categories requires a certain degree of interpretation. This problem is aggravated when there are multiple people performing the classifications and when the description lacks proper context. For example, all load-bearing works underground or underwater must be classified as “substructure”, but when they are over ground they become “structure”. There are, however, many structures that can be partially buried due to terrain issues, slopes, etc., so the classification of such items may well be down to subjective judgment. In this way a great variety of slightly different classifications can occur naturally in manual classification, which accentuates the need for an impartial classifier that can resolve these issues in an objective way. As the purpose of this standard is to compare like-for-like, even a systematically erroneous classification is preferable to classifications around a theme which won't match most items as intended.

In this section we describe the dataset and the modelling methods we applied to automate the classification of BoQs into the ICMS standard.

1 https://www.gov.uk/government/organisations/department-for-transport
2.1. Data acquisition and pre-processing

As a part of the TIES Living Labs project, a total of 124 thousand materials and costs items, defined in natural language and originally labelled manually in ICMS, were retrieved from a total of 24 projects from a major UK-based infrastructure construction company.

The data is presented as cost documents, which include the ICMS code, the free-text description to be analysed and a price breakdown per item. The prices were not considered in the study.

Each project contains several thousand lines of cost descriptions, written in natural language presumably by subcontractors executing the tasks or delivering the materials along the supply chain. These pieces of text are relatively short (the median length of the descriptions in the dataset is 14 words, with a maximum of 160 and a minimum of only one word). The encoding (each sample mapped to an ICMS code) has been performed manually by quantity surveyor experts with the support of the (British) Royal Institution of Chartered Surveyors (RICS2), which is part of the ICMS Coalition that aims at promoting widespread adoption of ICMS as a global standard.

The number of unique ICMS categories with at least one entry in the original dataset is 72 (from the overall total of 109 categories present in the cost side of the standard). However, many of these categories contained only a handful of items and were as a result discarded in this study. Only 32 categories contained sufficient samples for use in the presented study, reducing the total size of the dataset from 123210 to 51906 items, having additionally removed duplicated samples. We established a cut-off of 250 samples per ICMS category, such that all ICMS categories with fewer samples were removed from the dataset. The distribution of items/samples per ICMS encoding is provided in Fig. 2.

The following synthetic examples bear a very close resemblance to the text analysed in this study, both in wording and in length; the original data is protected by a non-disclosure agreement, and therefore cannot be shared publicly at this stage:

• “Galvanised high adherence reinforcing strips acting as soil reinforcement.”
• “Take down and remove to tip off Site unlit traffic sign including 4 posts.”
• “Installation of wildlife tunnel XX m in length as per diagram XX.”
• “Geophysical Survey in accordance with drawing XX.”
• “Termination of optic fibre cable to XX equipment cabinet Type YY.”
• “Power reduction joint of XX mm2 to XX mm2.”

Since the retrieved descriptions have been recorded by a large number of different people, varying levels of the complexities inherent in natural language were present as a result. We found distinguishable differences in the level of detail provided across data samples, e.g. references to internal codes, drawings, diagrams, scales, etc. The same information is additionally recorded in many inconsistent ways, i.e. ‘cable 10 m’, ‘cable 10 meter’, ‘cable 10m’, ‘ten metre cable’, which are essentially intended to record the same information. Additionally, many words have been found misspelled in at least one way.

Such inconsistencies were considered as the data was cleansed: special characters, including punctuation, and numbers were removed.

2.2. Classification Methods

For language representation and subsequent modelling, we considered two different approaches: an explicit representation of text with a vector space model[18] based on term(s) occurrence, and an implicit representation, using a word embedding approach[19], so that contextual semantics beyond term occurrence are represented to learn the corresponding target labels.

For term occurrence models we used the popular n-gram “bag-of-words” (BoW), whereby each unique term (or set of n terms) is considered as an independent dimension of the term space and is “one-hot” encoded as a sparse vector. Different weightings for term occurrence were additionally evaluated: a binary one-hot encoding, term frequency, and the popular Term Frequency-Inverse Document Frequency (TF-IDF)[20]. Although popular, allowing models to learn the corresponding targets based on local key features, this approach is nonetheless limited as it considers terms in text to be independent, and as a result the semantic term-term dependence is entirely disregarded.

2 https://www.rics.org/uk/
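As an illustration of the pre-processing and uni-gram TF-IDF representation described in this section, the following is a minimal scikit-learn sketch. It is not the authors' implementation: the column names, example rows and cleaning rules are assumptions used only to show the general shape of the pipeline.

```python
import re
import pandas as pd
from sklearn.feature_extraction.text import TfidfVectorizer

def clean_description(text: str) -> str:
    """Lower-case a BoQ description and strip punctuation, special characters and numbers."""
    text = text.lower()
    text = re.sub(r"[^a-z\s]", " ", text)   # drop digits, punctuation and special characters
    return re.sub(r"\s+", " ", text).strip()

# Hypothetical input: one row per BoQ item, with its free-text description and ICMS label.
boq = pd.DataFrame({
    "description": [
        "Installation of wildlife tunnel 25 m in length as per diagram 3.",
        "Take down and remove to tip off Site unlit traffic sign including 4 posts.",
    ],
    "icms_code": ["1.02.030", "1.05.080"],
})
boq["clean"] = boq["description"].apply(clean_description)

# Uni-gram bag-of-words with TF-IDF weighting, as used in Pipeline 1.
vectoriser = TfidfVectorizer(ngram_range=(1, 1), stop_words="english")
X = vectoriser.fit_transform(boq["clean"])   # sparse document-term matrix
print(X.shape, len(vectoriser.get_feature_names_out()))
```

Binary one-hot and raw term-frequency weightings can be obtained in the same way by swapping the vectoriser; stemming and lemmatisation, reported in Figure 3, would be added to the cleaning step.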
Figure 2: A histogram of the dataset; sample numbers per ICMS codes showing data imbalance. A cutoff of 250 is applied to exclude the overly
under-represented ICMS codes from the analysis. (inset) The categories below the threshold and thus not included in the study.
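The per-category counts and the 250-sample cutoff shown in Figure 2 could be applied roughly as follows; this is a sketch assuming the data sits in a pandas DataFrame with hypothetical description and icms_code columns, not the authors' exact code.

```python
import pandas as pd

MIN_SAMPLES = 250  # cutoff per ICMS category, as described in Section 2.1

def apply_category_cutoff(boq: pd.DataFrame, label_col: str = "icms_code") -> pd.DataFrame:
    """Drop duplicated items and remove ICMS categories with fewer than MIN_SAMPLES samples."""
    boq = boq.drop_duplicates(subset=["description", label_col])
    counts = boq[label_col].value_counts()
    kept = counts[counts >= MIN_SAMPLES].index
    return boq[boq[label_col].isin(kept)].reset_index(drop=True)

# After filtering, the 72 originally populated categories reduce to the 32 used in the study.
# filtered = apply_category_cutoff(boq)
# print(filtered["icms_code"].nunique(), len(filtered))
```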
On the other hand, word vectors[19], also known as word embeddings, provide a much more semantics-aware representation of language, where each word in the vocabulary V is embedded into a real-valued vector in a dense space of concepts of dimension d << |V|. The d dimensions encode concepts shared by all words, rather than a statistic relative to each unique word. This generally allows for richer word-word relationship information than the language representation of BoW models. Word vectors may either be initialised randomly and trained along with the machine learning models on a specific text mining task, or taken from pre-trained vectors. We evaluated both approaches, and used a pre-trained word2vec[21] model for the latter.

To learn ICMS classes from the possible contextual information provided in the BoQs, we further evaluated a set of deep learning methods, shown in Fig. 3b, which are widely used as sequence processing models. In particular we evaluated two RNN (recurrent neural network) architectures: the bidirectional LSTM, or BiLSTM, a bidirectional RNN consisting of a forward LSTM[22] unit and a backward LSTM unit to enhance the ability of the network to capture context information, and the simpler BiGRU[23], a bidirectional gated recurrent unit that combines the output states of a forward GRU[24] and a reverse GRU. Both models make up for the shortcomings of the basic RNN architecture, which is additionally known to be notoriously difficult to train[25, 26], in extracting global features from a text sequence, and have been widely used with notable improvement over basic LSTM and GRU architectures in a range of applications to text and speech processing[27, 28, 29, 30, 31].

We additionally evaluated a convolutional neural network (CNN)[32], which has been applied to model sequences for decades, and more recently to tasks of text classification, e.g., sentence classification[33, 34], document classification[35, 36, 37] and sentiment analysis[38, 39]. ConvNets utilise multiple convolution kernels of different sizes to extract key information in sentences, which can capture the local relevance of text. A variant of CNNs, the Temporal Convolutional Network (TCN), was recently proposed and has shown promising performance over standard CNN and RNN architectures on different NLP benchmarks[40]. We evaluated a TCN architecture primarily because, contrary to CNNs, which can only work with fixed-size text inputs and usually focus on terms in immediate proximity due to their static convolutional filter size, it applies techniques such as multiple layers of dilated convolutions and padding of input sequences in order to handle different sequence lengths and capture dependencies between terms that are not necessarily adjacent, but instead are positioned in different places in a sequence. This could potentially emphasise the strength of a signal which can be dispersed in a given BoQ sequence, e.g. (from the previous examples) “Take down and remove to tip off Site unlit traffic sign including 4 posts.”, regardless of the terms’ proximity.

2.2.1. Experiments and Model Description

Associating the samples provided in the BoQs with ICMS categories can be learned by machine learning methods for classification, casting the task as a supervised learning problem for natural language processing. For the studied dataset S = {s1, s2, s3, ..., sn}, where s1, s2, ..., sn are the short texts provided in the independent BoQs and |S| = 51906, the corresponding ICMS categories y1, y2, y3, ..., ym are provided as ground-truth labels. Each BoQ item si is associated with a unique ICMS category yi ∈ Y, where |Y| = 32.
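For the implicit representation, the pre-trained Word2Vec vectors mentioned above could be loaded and mapped onto the corpus vocabulary along the following lines. This is a sketch assuming the publicly distributed Google News binary file and gensim, with hypothetical variable names, rather than the authors' exact code.

```python
import numpy as np
from gensim.models import KeyedVectors
from tensorflow.keras.preprocessing.text import Tokenizer

EMBEDDING_DIM = 300  # dimension used for both the learned and the pre-trained embeddings

# Fit a tokenizer on the cleaned BoQ descriptions (hypothetical `texts` list).
texts = ["galvanised high adherence reinforcing strips acting as soil reinforcement"]
tokenizer = Tokenizer()
tokenizer.fit_on_texts(texts)
vocab_size = len(tokenizer.word_index) + 1  # +1 for the padding index

# Load the pre-trained Google News Word2Vec vectors (300 dimensions).
w2v = KeyedVectors.load_word2vec_format("GoogleNews-vectors-negative300.bin", binary=True)

# Build the |V| x d embedding matrix; out-of-vocabulary words keep a zero vector.
embedding_matrix = np.zeros((vocab_size, EMBEDDING_DIM))
for word, idx in tokenizer.word_index.items():
    if word in w2v:
        embedding_matrix[idx] = w2v[word]
# `embedding_matrix` can then initialise a Keras Embedding layer (frozen or trainable).
```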
Figure 3: Schematic view of the two modelling pipelines. (a) Pipeline 1 examining the performance of different classification models with bag-of-words for language representation; a uni-gram representation of the BoW model is used, with the total number of unique terms reduced to 6045 after applying stemming, lemmatisation and stop-word removal. (b) Pipeline 2 examining the performance of deep learning models with a word embedding layer; an embedding layer is applied to learn a word representation of 300 dimensions (d = 300 and |V| = 16800), or otherwise a pre-trained word embedding is used (see details on the use of Word2Vec).

To evaluate how the performance of classification models based on the two language representation approaches compares, especially given the unique characteristics of language use in the BoQs (short and of uneven size, application-specific and predominantly descriptive), we evaluated classification methods in two different experimental settings, as shown in Fig. 3.

In Pipeline 1, we trained and fine-tuned Support Vector Machine (SVM)3[41], Random Forest3[42, 43] and multilayer perceptron (MLP)4 algorithms. The MLP, a fully connected feed-forward neural network, is trained with an input layer of size 300, an additional hidden layer of size 50 and a softmax output layer of size 32.

In Pipeline 2 we evaluated the three models of BiLSTM, BiGRU and TCN, each applying two different word embeddings as in [21]: vectors learned in an embedding layer with randomly initialised weights, or pre-trained Word2Vec embeddings (trained on the Google News corpus containing 100 billion words)5. We use a 300-dimension word embedding for both the learned and the pre-trained case. The learning process therefore consists of first obtaining the semantic representation of each text through model training (or using the pre-trained skip-gram model), giving the vector representation of the words. Subsequently, the vector representation of each word is input into each model for further analysis and extraction of semantics. The final word vector is then connected to a softmax layer of size 32 for text classification.

For all neural network models, including the MLP, we use ADAM[44] for learning, with a learning rate of 0.01. The batch size is set to 64. The training epochs are set to 40. We employed the BiLSTM model as in [45], with two hidden layers of size 64. The BiGRU model is similarly trained with two hidden layers of the same size. A dropout rate of 0.5 is applied to both. All models in Pipeline 2 were implemented using Keras6 and TensorFlow7.

The TCN uses a 1D CNN layer, followed by two layers of dilated 1D convolution. We apply an exponential dilation d = 2^i for layer i in the network. Acausal convolutions are applied in the TCN so that target labels can be learnt as a function of terms at any time step in the sequence (contrary to the causal convolutions used in Wavenet[46]), with the kernel size set to 3 and 100 filters in each layer.

The dataset is split into a training and validation set (development set) of 80% of the entire corpus, used to train and fine-tune the models, and a test set of 20% (resulting in 10242 samples) to evaluate the different models' performance. The models were fine-tuned optimising the categorical cross-entropy l = − Σ_{c=1}^{C=32} y_{s,c} log(p_{s,c}), where p_{s,c} is the predicted probability that observation s is of class c, out of the 32 ICMS classes in the dataset.

2.3. Results and Analysis

We used TF-IDF with a Multinomial Naïve Bayes[47] classifier as a baseline model; a count vectoriser and uni-gram model with a feature set size of 6045 were used. Similarly, the CNN of [34], a classic baseline for text classification, based on the pre-trained word embedding, is additionally used as a baseline model.

3 The model was built using Scikit-learn: scikit-learn.org
4 The model was built using TensorFlow: tensorflow.org
5 https://code.google.com/p/word2vec/
6 F. Chollet. Keras. https://github.com/fchollet/keras, 2015
7 Software available from tensorflow.org
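A minimal Keras sketch of the Pipeline 2 recurrent models described in Section 2.2.1 is given below, reflecting the reported settings (two bidirectional hidden layers of size 64, dropout of 0.5, a 32-way softmax, ADAM with a learning rate of 0.01, batch size 64 and 40 epochs). The vocabulary and embedding sizes follow Figure 3; everything else (variable names, exact layer ordering) is an assumption for illustration, not the authors' implementation.

```python
from tensorflow.keras import Sequential
from tensorflow.keras.layers import Embedding, Bidirectional, LSTM, Dropout, Dense
from tensorflow.keras.optimizers import Adam

VOCAB_SIZE = 16800   # |V| reported in Figure 3
EMBED_DIM = 300      # d = 300; alternatively initialised from the Word2Vec matrix sketched earlier
NUM_CLASSES = 32     # ICMS categories retained in the study

def build_bilstm():
    """BiLSTM with two bidirectional hidden layers of size 64 and a 32-way softmax output."""
    model = Sequential([
        Embedding(VOCAB_SIZE, EMBED_DIM),
        Bidirectional(LSTM(64, return_sequences=True)),
        Bidirectional(LSTM(64)),
        Dropout(0.5),                      # dropout rate of 0.5, as reported
        Dense(NUM_CLASSES, activation="softmax"),
    ])
    model.compile(optimizer=Adam(learning_rate=0.01),
                  loss="categorical_crossentropy", metrics=["accuracy"])
    return model

# Training settings reported in the text: batch size 64, 40 epochs, 80/20 train-test split.
# model = build_bilstm()
# model.fit(X_train, y_train_onehot, validation_split=0.2, batch_size=64, epochs=40)
```

A BiGRU variant would simply swap the LSTM units for GRU units of the same size.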
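The generic TCN described above can be approximated with stacked dilated 1D convolutions. The sketch below uses plain Keras Conv1D layers with dilation rates following 2^i, kernel size 3 and 100 filters per layer, and 'same' (acausal) padding; the pooling layer and the exact stack depth around the two dilated layers are assumptions rather than the authors' configuration.

```python
from tensorflow.keras import Sequential
from tensorflow.keras.layers import Embedding, Conv1D, GlobalMaxPooling1D, Dropout, Dense
from tensorflow.keras.optimizers import Adam

def build_tcn(vocab_size=16800, embed_dim=300, num_classes=32):
    """TCN-style stack: one 1D convolution followed by two dilated 1D convolutions.

    'same' padding gives the acausal behaviour described in the text, so the label can
    depend on terms at any position in the description; dilation rates follow d = 2^i.
    """
    model = Sequential([
        Embedding(vocab_size, embed_dim),
        Conv1D(100, kernel_size=3, padding="same", activation="relu"),
        Conv1D(100, kernel_size=3, padding="same", dilation_rate=2, activation="relu"),
        Conv1D(100, kernel_size=3, padding="same", dilation_rate=4, activation="relu"),
        GlobalMaxPooling1D(),   # pooling over time is an assumption, not stated in the paper
        Dropout(0.5),
        Dense(num_classes, activation="softmax"),
    ])
    model.compile(optimizer=Adam(learning_rate=0.01),
                  loss="categorical_crossentropy", metrics=["accuracy"])
    return model
```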
Pipeline   Model   Embedding   Accuracy   Macro F1
1          NB      BoW         0.861      0.857
1          RF      BoW         0.922      0.918
…          …       …           …          …

The feed-forward MLP outperforms all models, achieving a ≥ 90% F1 score on 25 ICMS categories, which again confirms the suggestion that despite the […] [10].
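The accuracy and macro-averaged F1 figures above, as well as the per-category scores plotted in Figures 4 and 6, can be computed with scikit-learn along these lines; a sketch with hypothetical prediction arrays, not the authors' evaluation script.

```python
from sklearn.metrics import accuracy_score, f1_score, classification_report

# y_test and y_pred hold the true and predicted ICMS codes for the 10242 test samples.
def report_scores(y_test, y_pred):
    print("accuracy :", accuracy_score(y_test, y_pred))
    print("macro F1 :", f1_score(y_test, y_pred, average="macro"))
    # Per-category precision, recall and F1, as in Figures 4 and 6.
    print(classification_report(y_test, y_pred, digits=3))
```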
Figure 4: F1 score per ICMS category recorded for the TCN, RF, BiGRU and MLP on the test set. Black dots mean over 90%, white dots are below
90% F1 score. MLP (bottom) achieves a ≥ 90% F1 Score on 25 ICMS categories, compared to the rest of models with similar performance on
about 22 categories.
Here, a permanent CCTV is presumably of one class, and a temporary one is of another, whereas neither the word “permanent” nor “temporary” was necessarily present. That is, in order to improve inference beyond this point, more training data of a diverse contextual nature has to be provided, and subsequently modelled.

On the other hand, the under-performance of the different models observed on the 4 to 7 categories consistently below a 90% F1 score is partly caused by the same reasons stated earlier, amplified by the long-tailed distribution of samples across the 32 ICMS standards considered in the dataset. The majority of these categories happen to be significantly under-represented in the original dataset, and many of them stand only slightly above the 250-sample cutoff which was applied. This is to be compared to the mean of about 1600 samples per class in the dataset. Highly skewed datasets, where the minority classes are heavily outnumbered by one or more classes, have proven to be a challenge while at the same time becoming more and more common [48]. A conservative solution to this conundrum has been to under-sample by deleting the smallest minority classes, as done here with classes of fewer than 250 samples. Although we applied this limit relatively arbitrarily, it has been set as a trade-off between the classification of a larger number of ICMS categories on the one hand, and model stability on the other.

Alternatively, in the absence of richer (and potentially larger) datasets, methods for data augmentation in NLP (e.g., token-level perturbation like EDA [49], augmentation of misclassified samples [50] and techniques for under- and oversampling like SMOTE [51] and MLSMOTE [52], among others), which have shown improved performance on many text classification tasks, could potentially be applied here9. In this study, however, only the bootstrapping of the random forest was applied, as the overall classification performance was largely up to the mark.

All models were tuned and optimised experimentally. The reported performance of the SVM and Multinomial NB corresponds to their best models tuned with cross-validation (K-fold) on the development set. For the Random Forest we tuned the models on the development set to minimise the estimated out-of-bag (OOB)[53] error, as provided in Fig. 5, which showed noticeable convergence of performance towards a size of 600 classification trees. We additionally report the Precision and Recall scores corresponding to each ICMS category, recorded on the test set for the optimal RF, separately in Fig. 6.10

9 For a comprehensive review of data augmentation methods in NLP the reader is advised to refer to [48].
10 The performance of the RF of 600 trees is reported here. We provide access to the trained model in production alongside the implementation of the evaluated models in this study.
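The OOB-based tuning reported above (and the warm-start behaviour explained in the caption of Figure 5) can be reproduced with scikit-learn roughly as follows; the estimator grid and variable names are assumptions, not the authors' exact configuration.

```python
from sklearn.ensemble import RandomForestClassifier

def oob_error_curve(X_dev, y_dev, max_features="log2", n_trees=range(100, 801, 100)):
    """Grow one forest incrementally (warm_start) and record the OOB error at each size."""
    rf = RandomForestClassifier(max_features=max_features, oob_score=True,
                                warm_start=True, n_jobs=-1, random_state=0)
    errors = []
    for n in n_trees:
        rf.set_params(n_estimators=n)   # adds trees to the existing ensemble
        rf.fit(X_dev, y_dev)
        errors.append((n, 1.0 - rf.oob_score_))
    return errors

# Comparing max_features='sqrt' against 'log2' reproduces the curves in Figure 5;
# the OOB error flattens out at around 600 trees.
```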
As was foreseeable, this again shows some arguably peripheral under-performance of the model on instances of under-represented categories, as described earlier. Despite a better performance having been achieved on these particular samples by the different deep learning models, compared to the RF, it would nonetheless be overenthusiastic to draw conclusive arguments as to why that was.

Figure 5: The out-of-bag (OOB) error of the different RF configurations on the development set. The models with a smaller number of random features at each classification tree, corresponding to log2 |V|, show better skill at the inference task. Similarly, a slight improvement can be achieved when the warm-start (WS) hyper-parameter is activated. WS is a parameter provided by some Scikit-learn models to allow the attributes of an existing fitted model to initialise a new model in a subsequent call to fit.

Figure 6: Precision and Recall scores of the RF on the 32 ICMS categories. Although the average precision is very high, the algorithm is more sensitive to small data samples than the neural network-based models. Compare to Figure 4.

Different configurations of the different ANN models used in this study were evaluated. The best model was saved and its performance reported on the test set. The criteria for initial selection as candidate options included their reported performance in a wide range of language processing applications and benchmarks, whereas the architecture parameters were optimised relative to the classification performance of the models as well as that of their learning. Learning rate and drop-out rates were fixed as reported earlier. Most models showed similar learning performance (and loss minimisation rate) over the training epochs, as shown in Fig. 7, and were able to converge at 15 to 20 training epochs. Again, though, more complex models, e.g., the BiLSTM, whilst converging nearly similarly to the rest of the models, seem more prone to over-fitting, exhibiting a considerable difference between training and validation loss over the successive learning process, and are as such sub-optimally adjusted.

Figure 7: Learning, validation and loss performance for the different ANN architectures on the BoQ dataset. Whilst all models' learning converges quickly in training over the first 10 epochs, and is approximately asymptotic thereafter, the TCN and MLP seem to exhibit less variance with better accuracy performance overall. Recurrent architectures, and especially the BiLSTM, appear to have the highest variance on the dataset as a result of learning a much larger set of parameters.

There is a marginal improvement in performance when the word embedding is learned by the models rather than taken from pre-trained vectors, although at some computational cost, as models using pre-trained vectors were relatively faster to train. Consistently, nonetheless, these models showed higher loss on validation instances. Although the corpus used to train the models can be deemed sufficiently large quantitatively, it is less so semantically, especially due to its descriptive and short nature, and the considerable presence of specialised, and occasionally non-English, language. This can explain the marginal edge achieved by learning an embedding vector for language representation on this corpus compared to a pre-trained one, consistent with the conclusions in [54] on the benefits of learning word embeddings for the construction domain.

In general, the experimental results indicate an effective, high inference skill of all ANN architectures on this task, with comparable results additionally available with RFs. In fact, due to the specific nature of language use in the BoQs (short, descriptive and technical), simpler models achieved better accuracy performance. Both MLP and TCN were able to outperform other, more sophisticated, methods.
As mentioned earlier, the position of a term in a text is only important once the context is inferred from the text. In the case of short texts, context simply isn't provided, and can only be inferred by experts by looking at other variables or based on previous knowledge of the project, most of which is not modelled. The “more-flexible-memory” advantage of RNNs is therefore largely inconsequential at this task, and as a result the TCN exhibited comparable memory to recurrent architectures with the same capacity. The TCN also has a very small number of parameters compared to the BiGRU and BiLSTM networks, and as the texts are too simple to make use of this added complexity, these models tend to overfit and comparatively underperform.
3. Conclusion and Impact

This work presents the first attempt to automate the (still manually handled) mapping of free-text descriptions of work and cost items, from construction cost documents called bills of quantities (BoQs), into the International Construction Measurement Standard (ICMS). This will enable benchmarkers to compare and benchmark the performance of projects at a scale that was not possible before, and facilitate more effective cost and risk analysis in construction projects. To that end we evaluated state-of-the-art machine learning methods to learn multi-class text classification models from 51906 item descriptions, retrieved from 24 different infrastructure construction projects carried out by contractors of publicly owned companies across the United Kingdom.

We considered two approaches to our modelling: one assuming that information signals can be captured from local features of the description text provided in the BoQs, and another on the premise that, alongside local key features, the potential propagation of information and semantics in the text may help improve the learning of ICMS codes. To do that we evaluated a range of classification methods which have been widely used on text classification tasks, including support vector machines, random forests, the multi-layer perceptron, and advanced deep learning architectures commonly used in sequence modelling, including recurrent (LSTM, GRU) and convolutional (CNN, TCN) architectures.

Whilst the results strongly suggest that most models are largely skilful at inferring ICMS standards from the short text provided in the BoQs, we found that simpler models, like the RF, and generic MLP and TCN architectures with minimal tuning, outperform recurrent, more sophisticated, architectures such as LSTMs and GRUs. This is largely down to the nature of the text found in the BoQs. That is, it is considerably condensed, short and strictly descriptive, so much so that its complexity strikes as being a function of abstraction in key term use, rather than of the inherent complexity of semantic dynamics in language use. The “long memory” advantage of RNNs is therefore largely inconsequential at this task, and as a result the TCN exhibited comparable memory to recurrent architectures with the same capacity, while simpler models like the MLP and RF were able to capture the required mapping favourably from local key features.

As adoption of ICMS gains traction, more annotated data will be made available and the evaluated models can be re-trained to learn further ICMS categories. It is therefore hoped that the findings of this study will trigger this process further. To that end, the trained MLP model and the development code of this study are made available to the community and can be readily used11. Consequently, we believe this study presents a compelling case for the construction industry community, both private and public sectors, to prioritise an open data approach along their supply lines, apace with considerable use of tools to ensure frictionless standardisation. We argue this will allow for vital developments in the field leading to a transformative automated benchmarking system.

Acknowledgements

This work has been supported by Innovate UK under Grant N: 08027517 as a part of “Transport infrastructure efficiency strategy living labs” (TIES Living Labs) Project N. 106171.

References

[1] N. Thompson, W. Squires, N. Fearnhead, R. Claase, Digitalisation in construction - industrial strategy review, supporting the government's industrial strategy, Tech. rep., University College London, London (2017).
[2] N. Davies, G. Atkins, D. Slade, How to transform infrastructure decision making in the UK, Tech. rep., Institute for Government, London (2018).
[3] S. Changali, A. Mohammad, M. v. Nieuwland, The construction productivity imperative, McKinsey, 2015.
[4] W. Pan, A. G. Gibb, A. R. Dainty, Leading UK housebuilders' utilization of offsite construction methods, Building Research & Information 36 (1) (2008) 56–67.
[5] P. Fewings, C. Henjewele, Construction project management: an integrated approach, Routledge, 2019.

11 Operational model (MLP) and development code are available
[6] X. Yin, H. Liu, Y. Chen, M. Al-Hussein, Building information modelling for off-site construction: Review and future directions, Automation in Construction 101 (2019) 72–91.
[7] M. El Jazzar, M. Piskernik, H. Nassereddine, Digital twin in construction: An empirical analysis, in: EG-ICE 2020 Workshop on Intelligent Computing in Engineering, Proceedings, 2020, pp. 501–510.
[8] S. Wu, K. Ginige, G. Wood, S. W. Jong, et al., How can building information modelling (BIM) support the new rules of measurement (NRM1), Tech. rep., Royal Institution of Chartered Surveyors (2014).
[9] A. Muse, M. Horner, G. O'Sullivan, C. Fry, A. Aronsohn, D. Baharuddin, P. Bredehoeft, T. Chatzisymeon, R. Fadason, R. Flanagan, et al., ICMS: Global Consistency in Presenting Construction Life Cycle Costs and Carbon Emissions (2021).
[10] C. Mitchell, International construction measurement standards (ICMS) explained, Tech. rep., International Construction Measurement Standards Coalition (ICMSC) (2016). URL https://icms-coalition.org/
[11] M. D. Deo Prasad, A. Kuru, P. Oldfield, L. Ding, C. Noller, B. He, Race to net zero carbon: A climate emergency guide for new and existing buildings in Australia, Tech. rep., Low Carbon Institute (2021).
[12] TIES living lab, https://tieslivinglab.co.uk/, [Online; accessed 8-July-2022] (2022).
[13] X. Li, H. Xie, L. Chen, J. Wang, X. Deng, News impact on stock price return via sentiment analysis, Knowledge-Based Systems 69 (2014) 14–23.
[14] J. Y. Lee, F. Dernoncourt, Sequential short-text classification with recurrent and convolutional neural networks, in: Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics, 2016, pp. 515–520.
[15] R. Fellows, H. Ihshaish, S. Battle, C. Haines, P. Mayhew, J. I. Deza, Task-oriented dialogue systems: performance vs. quality-optima, a review, in: David C. Wyld et al. (Eds): SIPP, NLPCL, BIGML, SOEN, AISC, NCWMC, CCSIT, 2022, pp. 69–87. doi:10.5121/csit.2022.121306.
[16] R. Nicholls, R. Fellows, S. Battle, H. Ihshaish, Problem classification for tailored helpdesk auto-replies, in: Artificial Neural Networks and Machine Learning – ICANN 2022, Springer Nature Switzerland, Cham, 2022, pp. 445–454. doi:10.1007/978-3-031-15937-4_37.
[17] J. Hu, G. Wang, F. Lochovsky, J.-t. Sun, Z. Chen, Understanding user's query intent with wikipedia, in: Proceedings of the 18th international conference on World wide web, 2009, pp. 471–480.
[18] G. Salton, C. Buckley, Term-weighting approaches in automatic text retrieval, Information Processing & Management 24 (5) (1988) 513–523. doi:10.1016/0306-4573(88)90021-0.
[19] Y. Bengio, R. Ducharme, P. Vincent, C. Janvin, A neural probabilistic language model, J. Mach. Learn. Res. 3 (2003) 1137–1155.
[20] G. Salton, C. Buckley, Term-weighting approaches in automatic text retrieval, Information Processing & Management 24 (5) (1988) 513–523. doi:10.1016/0306-4573(88)90021-0.
[21] T. Mikolov, I. Sutskever, K. Chen, G. S. Corrado, J. Dean, Distributed representations of words and phrases and their compositionality, in: C. Burges, L. Bottou, M. Welling, Z. Ghahramani, K. Weinberger (Eds.), Advances in Neural Information Processing Systems, Vol. 26, Curran Associates, Inc., 2013, p. 9.
[22] S. Hochreiter, J. Schmidhuber, Long short-term memory, Neural computation 9 (8) (1997) 1735–1780.
[23] X. Luo, W. Zhou, W. Wang, Y. Zhu, J. Deng, Attention-based relation extraction with bidirectional gated recurrent unit and highway network in the analysis of geological data, IEEE Access 6 (2018) 5705–5715. doi:10.1109/ACCESS.2017.2785229.
[24] K. Cho, B. van Merriënboer, C. Gulcehre, D. Bahdanau, F. Bougares, H. Schwenk, Y. Bengio, Learning phrase representations using RNN encoder–decoder for statistical machine translation, in: Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP), Doha, Qatar, 2014, pp. 1724–1734. doi:10.3115/v1/D14-1179.
[25] Y. Bengio, P. Simard, P. Frasconi, Learning long-term dependencies with gradient descent is difficult, IEEE Transactions on Neural Networks 5 (2) (1994) 157–166. doi:10.1109/72.279181.
[26] R. Pascanu, T. Mikolov, Y. Bengio, On the difficulty of training recurrent neural networks, in: S. Dasgupta, D. McAllester (Eds.), Proceedings of the 30th International Conference on Machine Learning, Vol. 28 of Proceedings of Machine Learning Research, Atlanta, Georgia, USA, 2013, pp. 1310–1318.
[27] J. Chen, Y. Hu, J. Liu, Y. Xiao, H. Jiang, Deep short text classification with knowledge powered attention, in: Proceedings of the Thirty-Third AAAI Conference on Artificial Intelligence and Thirty-First Innovative Applications of Artificial Intelligence Conference and Ninth AAAI Symposium on Educational Advances in Artificial Intelligence, AAAI'19/IAAI'19/EAAI'19, AAAI Press, 2019, p. 8. doi:10.1609/aaai.v33i01.33016252.
[28] D. Bahdanau, K. Cho, Y. Bengio, Neural machine translation by jointly learning to align and translate, in: Y. Bengio, Y. LeCun (Eds.), 3rd International Conference on Learning Representations, ICLR 2015, San Diego, CA, USA, May 7-9, 2015, Conference Track Proceedings, 2015, p. 9.
[29] T. Zhang, R. Xu, Performance Comparisons of Bi-LSTM and Bi-GRU Networks in Chinese Word Segmentation, Association for Computing Machinery, New York, NY, USA, 2021, Ch. 3, pp. 73–80.
[30] V. Vukotić, C. Raymond, G. Gravier, A step beyond local observations with a dialog aware bidirectional gru network for spoken language understanding, in: Interspeech, 2016, pp. 3241–3244. doi:10.21437/Interspeech.2016-1301.
[31] Z. Xiao, P. Liang, Chinese sentiment analysis using bidirectional lstm with word embedding, in: X. Sun, A. Liu, H.-C. Chao, E. Bertino (Eds.), Cloud Computing and Security, Springer International Publishing, Cham, 2016, pp. 601–610.
[32] Y. LeCun, B. Boser, J. S. Denker, D. Henderson, R. E. Howard, W. Hubbard, L. D. Jackel, Backpropagation applied to handwritten zip code recognition, Neural Computation 1 (4) (1989) 541–551. doi:10.1162/neco.1989.1.4.541.
[33] N. Kalchbrenner, E. Grefenstette, P. Blunsom, A convolutional neural network for modelling sentences, in: Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), Association for Computational Linguistics, Baltimore, Maryland, 2014, pp. 655–665. doi:10.3115/v1/P14-1062.
[34] Y. Kim, Convolutional neural networks for sentence classification, in: Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP), Association for Computational Linguistics, Doha, Qatar, 2014, pp. 1746–1751. doi:10.3115/v1/D14-1181.
[35] A. Conneau, H. Schwenk, L. Barrault, Y. Lecun, Very deep convolutional networks for text classification, in: Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics, 2017, pp. 1107–1116. doi:10.18653/v1/E17-1104.
[36] R. Johnson, T. Zhang, Effective use of word order for text categorization with convolutional neural networks, in: Proceedings of the 2015 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 2015, pp. 103–112. doi:10.3115/v1/n15-1011.
[37] R. Johnson, T. Zhang, Deep pyramid convolutional neural networks for text categorization, in: Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), Association for Computational Linguistics, Vancouver, Canada, 2017, pp. 562–570. doi:10.18653/v1/P17-1052.
[38] X. Ouyang, P. Zhou, C. H. Li, L. Liu, Sentiment analysis using convolutional neural network, in: 2015 IEEE International Conference on Computer and Information Technology; Ubiquitous Computing and Communications; Dependable, Autonomic and Secure Computing; Pervasive Intelligence and Computing, 2015, pp. 2359–2364. doi:10.1109/CIT/IUCC/DASC/PICOM.2015.349.
[39] S. Liao, J. Wang, R. Yu, K. Sato, Z. Cheng, CNN for situations understanding based on sentiment analysis of twitter data, Procedia Computer Science 111 (2017) 376–381, the 8th International Conference on Advances in Information Technology. doi:10.1016/j.procs.2017.06.037.
[40] S. Bai, J. Z. Kolter, V. Koltun, An empirical evaluation of generic convolutional and recurrent networks for sequence modeling, ArXiv abs/1803.01271 (2018).
[41] B. E. Boser, I. M. Guyon, V. N. Vapnik, A training algorithm for optimal margin classifiers, in: Proceedings of the Fifth Annual Workshop on Computational Learning Theory, COLT '92, Association for Computing Machinery, New York, NY, USA, 1992, pp. 144–152. doi:10.1145/130385.130401.
[42] L. Breiman, Random forests, Machine Learning 45 (1) (2001) 5–32. doi:10.1023/A:1010933404324.
[43] T. K. Ho, Random decision forests, in: Proceedings of 3rd international conference on document analysis and recognition, Vol. 1, IEEE, 1995, pp. 278–282.
[44] D. P. Kingma, J. Ba, Adam: A method for stochastic optimization, in: Y. Bengio, Y. LeCun (Eds.), 3rd International Conference on Learning Representations, ICLR 2015, San Diego, CA, USA, May 7-9, 2015, Conference Track Proceedings, 2015, p. 15.
[45] Y. Hao, Y. Zhang, K. Liu, S. He, Z. Liu, H. Wu, J. Zhao, An end-to-end model for question answering over knowledge base with cross-attention combining global knowledge, in: Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), Vancouver, Canada, 2017, pp. 221–231. doi:10.18653/v1/P17-1021.
[46] A. van den Oord, S. Dieleman, H. Zen, K. Simonyan, O. Vinyals, A. Graves, N. Kalchbrenner, A. Senior, K. Kavukcuoglu, Wavenet: A generative model for raw audio, in: Arxiv, 2016, p. 15. URL https://arxiv.org/abs/1609.03499
[47] A. M. Kibriya, E. Frank, B. Pfahringer, G. Holmes, Multinomial naive bayes for text categorization revisited, in: G. I. Webb, X. Yu (Eds.), AI 2004: Advances in Artificial Intelligence, Springer Berlin Heidelberg, 2005, pp. 488–499.
[48] S. Y. Feng, V. Gangal, J. Wei, S. Chandar, S. Vosoughi, T. Mitamura, E. Hovy, A survey of data augmentation approaches for NLP, in: Findings of the Association for Computational Linguistics: ACL-IJCNLP 2021, 2021, pp. 968–988. doi:10.18653/v1/2021.findings-acl.84.
[49] J. Wei, K. Zou, EDA: Easy data augmentation techniques for boosting performance on text classification tasks, in: Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP), Association for Computational Linguistics, Hong Kong, China, 2019, pp. 6382–6388. doi:10.18653/v1/D19-1670.
[50] T. Dreossi, S. Ghosh, X. Yue, K. Keutzer, A. Sangiovanni-Vincentelli, S. A. Seshia, Counterexample-guided data augmentation, in: Proceedings of the 27th International Joint Conference on Artificial Intelligence, IJCAI'18, AAAI Press, 2018, pp. 2071–2078.
[51] N. V. Chawla, K. W. Bowyer, L. O. Hall, W. P. Kegelmeyer, SMOTE: synthetic minority over-sampling technique, Journal of Artificial Intelligence Research 16 (2002) 321–357.
[52] F. Charte, A. J. Rivera, M. J. del Jesus, F. Herrera, MLSMOTE: Approaching imbalanced multilabel learning through synthetic instance generation, Knowledge-Based Systems 89 (2015) 385–397. doi:10.1016/j.knosys.2015.07.019.
[53] L. Breiman, Out-of-bag estimation, Tech. rep., Dept. of Statistics, Univ. of California Berkeley (1996). URL www.stat.berkeley.edu/~breiman/OOBestimation.pdf
[54] A. J. P. Tixier, M. Vazirgiannis, M. R. Hallowell, Word embeddings for the construction domain (2016). doi:10.48550/ARXIV.1610.09333.