Application of TensorFlow Lite on embedded devices
A hands-on practice of converting a TensorFlow model to a TensorFlow Lite
model and deploying it on a smartphone to compare the models'
performance.

Mitra Rashidi

Computer Engineering BA (C), Final Project


Main field of study: Computer Engineering
Credits: 15 hp
Semester, Year: Spring, 2022
Supervisor: Karl Pettersson, [email protected]
Examiner: Patrik Österberg, [email protected]
Course code: DT099G
Programme: Computer Science, 180 credits
At Mid Sweden University, it is possible to publish the thesis in full text in DiVA (see
appendix for publishing conditions). The publication is open access, which means that
the work will be freely available to read and download online. This increases the
dissemination and visibility of the degree project.
Open access is becoming the norm for disseminating scientific information online. Mid
Sweden University recommends both researchers and students to publish their work
open access.

I/we allow publishing in full text (free available online, open access):
☒ Yes, I/we agree to the terms of publication.

☐ No, I/we do not accept that my independent work is published in the
public interface in DiVA (only archiving in DiVA).

Sundsvall 2022-06-23
................................................................................................................................
Location and date

Datateknik Kandidat / DT099G


................................................................................................................................
Programme/Course

Mitra Rashidi
................................................................................................................................
Name (all authors names)

1996
................................................................................................................................
Year of birth (all authors year of birth)

Abstract
This thesis describes the development of a procedure for integrating machine learning
(ML) on smartphones and compares the result with a conventional computer-based
TensorFlow model. Machine learning currently attracts a great deal of attention because
of the strong demand for it in intelligent applications such as computer vision,
natural language processing, recommendation systems, and time series problems. The
limited resources of a smartphone make it challenging to recognize a wide variety of
activities with high precision. A user-friendly procedure is proposed for the design,
development, evaluation, and deployment of machine learning models for embedded devices
with limited resources. TensorFlow (TF) and TensorFlow Lite (TF Lite) were selected for
the task. The thesis presents the procedure for turning a base machine learning model
designed for a desktop computer or laptop into a compressed and optimized version of the
same model for integration on a device with limited resources. The two models were
compared in terms of accuracy, the storage size of the model file, and the time taken to
produce a prediction. The results showed that the TensorFlow Lite model was as accurate
as the base model, that its size was 60% smaller than that of the base model, and that
its response time was 70% shorter. The TensorFlow Lite model was thus found to be highly
favorable for machine learning integration on embedded devices, which shows that such
integration is promising with the procedure proposed in the thesis. Finally, the model
was deployed on an Android smartphone, and its practicality and feasibility were
demonstrated. The framework adopts a unique and reliable approach that provides
flexibility while meeting the challenge of integrating machine learning on an Android
device.

Keywords: Machine learning, TensorFlow, TensorFlow Lite, Smartphone.

Sammanfattning
The report describes the development of a process for integrating machine learning on
smart devices and compares the result with traditional computer models such as
TensorFlow. Machine learning is a field that currently receives much attention because
of the notable demand for it in intelligent applications such as computer vision,
natural language processing, recommendation systems, and time series problems. The
resource limitations of smart devices make it difficult to recognize a wide range of
different activities with high precision. A user-friendly process is proposed for the
design, development, evaluation, and deployment of machine learning models for
resource-constrained embedded devices. TensorFlow and TensorFlow Lite were selected for
the thesis work, which presents the workflow from a machine learning model designed for
a desktop computer or laptop to a compressed and optimized version of the same model
integrated on a device with limited resources. The models were compared and results were
obtained. The TensorFlow Lite model proved to be very well suited to machine learning
integration on embedded devices. The storage size of the developed model file and the
time taken to predict a value were compared. The results showed that the TensorFlow Lite
model was as accurate as the base model, that its size was 60% smaller than that of the
base model, and that its response time was 70% shorter than that of the base model. This
shows that machine learning can be integrated on embedded devices with the process
proposed in the thesis. Finally, the model was deployed on an Android smartphone, and
its practical functionality and feasibility were demonstrated. The framework has a
unique and reliable approach, which provides flexibility while meeting the challenge of
integration on Android devices.

Keywords: Machine learning, TensorFlow, TensorFlow Lite, Smartphone.

Acknowledgements
I am extremely grateful to my supervisor, Karl Pettersson, for his invaluable advice,
continuous support, and patience during my thesis. His immense knowledge and
creative vision encouraged me throughout my research, implementation, and
conclusions.
I would also like to express my gratitude to my family. Without their tremendous
understanding and encouragement over the past few weeks, it would have been impossible
for me to complete my thesis.

Table of Contents
Abstract
Sammanfattning
Acknowledgements
Terminology

1 Introduction
1.1 Background and problem motivation
1.2 Overall aim
1.3 Problem statement/Research questions
1.4 Knowledge goals
1.5 Scope
1.6 Outline
2 Theory
2.1 Machine Learning
2.2 Transfer Learning
2.2.1 TensorFlow
2.2.2 Tensor
2.2.3 Tensor Calculation
2.2.4 TensorFlow Lite
2.3 Model optimization with TensorFlow
2.3.1 Quantization
2.3.2 Post-training quantization
2.3.3 Pruning
2.4 Embedded devices
2.4.1 Micro controller
2.4.2 Smartphones
2.4.3 Raspberry pi
2.5 Environment
2.5.1 Google Colab
2.6 Evaluation dependent Parameters
2.6.1 Accuracy of model
2.6.2 Time of prediction
2.6.3 Need of resources
2.7 Related work
2.7.1 Design and optimization of a TF Lite on a smartphone
2.7.2 Deploying Image Deblurring Across Mobile Devices
3 Methodology
3.1 Scientific method description
3.2 Project method description
3.2.1 Problem definition
3.2.2 Literature study
3.2.3 Implementation
3.2.4 Measurement setup
3.2.5 Evaluation
3.3 Evaluation method
4 Implementation
4.1 Machine Learning model building using Keras
4.2 Machine Learning model training and evaluation
4.3 TensorFlow lite Conversion and optimization
4.3.1 TensorFlow Lite Conversion
4.3.2 TensorFlow Lite Converter optimization
4.4 TensorFlow lite Evaluation on Colab
4.5 TensorFlow Lite Model integration on Android smartphone
5 Results
5.1 Resulting system
5.2 Measurement results
6 Discussion
6.1 Analysis and discussion of results
6.2 Project method discussion
6.2.1 Definition of the issue
6.2.2 Study of literature
6.2.3 Implementation
6.3 Scientific discussion
6.4 Ethical and societal discussion
7 Conclusions
7.1 Answers of problem statement
7.2 Future Work
7.2.1 Text detection and recognition
7.2.2 Object detection and tracking

Terminology
AAAL American Association for Applied Linguistics
AI Artificial Intelligence
ANN Artificial Neural Network
API Application Programming Interface
ARM Acorn RISC Machine
CPU Central Processing Unit
PC Personal Computer
DL Deep Learning
DSP Digital Signal Processing
FPU Floating-point unit
GPU Graphics Processing Unit
IEEE Institute of Electrical and Electronics Engineers
IDE Integrated Development Environment
IoT Internet of Things
MCU Microcontroller Unit
ML Machine Learning
NLP Natural Language Processing
NN Neural Network
SIMD Single Instruction Multiple Data
TF TensorFlow
TF Lite TensorFlow Lite
TPU Tensor Processing Unit

1 Introduction
New technologies have historically had a huge influence on human development:
television, computers, the Internet, and, most recently, smartphones have entered our
daily lives, and living without them has become unfathomable. Running machine
learning on small devices, so that they can take on complex tasks themselves, increases
the AI capabilities of those devices. This raises the basic question of how such systems
are used and how much they are capable of [1].
The Internet of Things has greatly expanded the number of technical gadgets, and their
prominence in everyday life has opened up a variety of applications in a wide range
of scenarios. Autonomous IoT devices build on this by incorporating AI into the devices
themselves. Bringing logic, learning ability, context inference, and anticipation to IoT
can eventually reduce the strain on the operator. Implementing AI on the gadgets
also addresses the basic privacy problem that has hampered the mainstream deployment
of cloud computing in every home. This largely answers the question of
how AI systems should be used: they should promote well-being by enabling smart
gadgets that can fade into the background [2].
Machine Learning (ML) with TensorFlow Lite (TF Lite) is a burgeoning field at the
intersection of embedded systems and Artificial Intelligence (AI). As such, a new range
of embedded applications is emerging for neural networks. Because these models are
extremely small (a few hundred KB) and run on microcontrollers or embedded subsystems
based on digital signal processors, they can operate continuously with minimal
impact on device battery life. Amazon, Apple, Google, and others use tiny neural
networks on billions of devices to run always-on inference for keyword detection,
visual object detection, human-activity recognition, and anomaly detection [1]. This
thesis therefore examines how effective such models are on embedded devices.

1.1 Background and problem motivation


Machine Learning (ML) is an AI application in which computers learn autonomously
from data. A statistical model is built using available facts, referred to as training data.
The resulting model can then generate predictions based on previously learned
quantitative observations, such as collections and aggregations [3].
AI systems and the algorithms applied in ML have changed the way detailed scientific
and technological problems are handled in most fields. Significant advances can be
seen in the quality of new-generation computer vision, natural language processing,
speech recognition, and a wide array of other processes. However, only a limited number
of tools are available on the market for developers who want to deploy AI and ML on
embedded devices [4].
Embedded platforms are characterized by their rigid limitations. Advances in software
development are slow to reach these platforms because the required resources are too
expensive. Several facilities that are elementary to modern programmers may be missing,
including dynamic memory management, an operating system, a standard
instruction set, a file system, and floating-point hardware.

ML is still maturing despite its exponential growth. Researchers continue to experiment
with new operations and network architectures to obtain better results from their
models. As results improve, the demand for these models among designers increases.
A review of the literature [2] identifies the following challenges:

• Few platforms allow ML integration on embedded devices [4].
• Training tools are scarce in the ML market.
• Model deployment tools have limited access, and their lack of productivity makes
ML a challenging process for beginners.
• There is no proper framework for the compression and quantization of ML
algorithms. Further, there is a lack of platforms for model invocation and
execution.

1.2 Overall aim


The aims of this thesis are:
• To explore the features provided by TF for integrating ML or DL models on
embedded devices.
• To gain hands-on practice in converting a model to the TF Lite format, applying
the model optimization and compression techniques provided by TF, and deploying
the result on embedded devices.

1.3 Problem statement/Research questions


This thesis investigates the following research questions:
Q1- What is the scope and application of Machine Learning on smartphones?
Q2- What challenges and limitations are faced when deploying Machine
Learning models on smartphones?
Q3- What, in brief, is the procedure for deploying Machine Learning on a
smartphone using TensorFlow Lite?
Q4- How is a Machine Learning model's performance affected when it is
compressed and optimized for use on a smartphone, compared with the
original model running on a standard computer or laptop?

1.4 Knowledge goals
The knowledge goals of this thesis are as follows:

• The procedure for deploying ML on embedded devices.
• A comparison of the results and an assessment of the method's effectiveness on
smartphones.

1.5 Scope
In this thesis, I focus on the services provided by TF for making ML models applicable
on smartphones. I will explore the procedure for compressing and optimizing a machine
learning model using TF and monitor its effect on the model's performance by comparing
accuracy, resource requirements, memory consumption, and response time.

1.6 Outline
In this thesis, I will introduce the tools provided by TensorFlow for integrating machine
learning or deep learning algorithms on embedded devices. I will explore the options
offered by Google's leading open-source library for compressing a model and evaluate the
performance of the compressed models.
The remainder of this thesis is organized as follows: Chapter 2 presents the prerequisite
theory, including machine learning, TensorFlow, TensorFlow Lite, tensors, embedded
devices, transfer learning, and some related research on integrating machine
learning models on embedded devices. Chapter 3 presents the methods followed to make
the model work smoothly on an Android smartphone and describes how I planned the
different modules of the thesis. Chapter 4 presents hands-on practice in building,
evaluating, compressing, and optimizing an ML model and integrating it on an Android
smartphone using the TensorFlow library. Chapter 5 presents the measurements gathered
during the implementation phase in a table and bar charts, as a basis for the discussion
and conclusions in the following chapters. Chapter 6 discusses the results, based on an
analysis of the implementation in Chapter 4. The discussion also addresses other
aspects, e.g., the project method, the scientific discussion, and the ethical and
societal aspects of ML integration on a smartphone. Chapter 7 presents the conclusions
of the thesis: whether the required goal was achieved and the answers to the questions
raised, together with recommendations for future work.

2 Theory
In this chapter, descriptions of various topics that are necessary to understand the
remainder of the paper are presented.

2.1 Machine Learning


ML is a technique in which a computer learns some abstract concept from data and
applies it to as yet unseen situations. ML allows software applications to become more
accurate at predicting outcomes without being explicitly programmed to do so. Here is
a slightly more general definition:

“Machine Learning is the field of study that gives computers the ability to learn without
being explicitly programmed.” —Arthur Samuel, 1959

And a more engineering-oriented one:

“A computer program is said to learn from experience E with respect to some task T
and some performance measure P, if its performance on T, as measured by P, improves
with experience E.” —Tom Mitchell, 1997

In general [3], ML is a process of optimizing a model. A model is a collection of
instructions that aims to solve a given task. Being an algorithm, a model can
transform a given input into the desired output with some level of efficiency. To increase
the efficiency of the model, its parameters need to be optimized. The optimization
process is called training, and it is performed on relevant training data. The training
process runs in steps called epochs. After each epoch, the model can be validated to
keep track of the training process. Validation is done by providing the model with
data that it has not seen during training. Analyzing the validation
outcome determines the efficiency of the model [4]. Figure 1 below shows an
artificial neural network with three layers.

Figure 1: Neural Network

Artificial neural networks are used to solve a wide range of problems, not only in
computer vision [5] but also in other fields, such as data classification, knowledge
extraction, and speech recognition [6].
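
The training and validation cycle described above can be illustrated with a minimal sketch. It is not the model used in this thesis: the toy data (points on the line y = 2x + 1), the learning rate, and the function names are assumptions chosen only to show parameters being optimized epoch by epoch and validated on data that is never used for training.

```python
# Toy data following y = 2x + 1 (an assumption made purely for illustration).
train_data = [(x, 2.0 * x + 1.0) for x in [0.0, 1.0, 2.0, 3.0, 4.0]]
val_data = [(x, 2.0 * x + 1.0) for x in [0.5, 1.5, 2.5]]  # never used for training


def mean_squared_error(w, b, data):
    """Average squared difference between predictions and targets."""
    return sum((w * x + b - y) ** 2 for x, y in data) / len(data)


def train(epochs=500, learning_rate=0.05):
    """Fit y = w*x + b by gradient descent and validate after every epoch."""
    w, b = 0.0, 0.0      # model parameters to be optimized
    history = []         # validation loss recorded once per epoch
    n = len(train_data)
    for _ in range(epochs):
        # Gradients of the mean squared error with respect to w and b.
        grad_w = sum(2 * (w * x + b - y) * x for x, y in train_data) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in train_data) / n
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
        history.append(mean_squared_error(w, b, val_data))
    return w, b, history


w, b, history = train()
print(f"w={w:.3f}, b={b:.3f}, final validation loss={history[-1]:.6f}")
```

A validation loss that keeps falling alongside the training loss suggests that the model generalizes; a validation loss that starts rising while the training loss still falls is the usual sign of overfitting.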

2.2 Transfer Learning


In transfer learning, the knowledge learned by one model is transferred to a new
model. A pretrained model is set as the teacher and a similar version of that network as
the student. Consequently, training the new model does not focus on solving the problem
from the original inputs and outputs, but rather on copying the operation of the
pretrained model [5, 6].
Transfer learning has been attracting more and more attention in both research and
industry for building machine learning models capable of solving problems using
data from related problems [7].
The key idea of transfer learning is that a model pre-trained on a massive amount of
training data can be reused to solve problems for which only limited data is available.
It also saves time and resources [8].

2.2.1 TensorFlow
TensorFlow is the second-generation framework of Google Brain. Version 1.0.0 was
released on Feb 11, 2017. Unlike the reference implementation, TensorFlow can run on
multiple multi-core CPUs. TensorFlow is available for 64-bit Linux, macOS, and Windows,
and for smartphone platforms such as iOS and Android [7].
Its adaptable design enables simple deployment of computation across a wide variety
of platforms, from PCs to clusters of hundreds of computers to smartphones and other
devices. In May 2016, Google revealed the Tensor Processing Unit (TPU), an
application-specific microchip designed exclusively for deep learning and customized
for TensorFlow [8]. A TPU is a configurable AI accelerator that is intended for using or
executing models rather than training them. Google said that it had been using TPUs in
its data centers for over a year and had found that they deliver an order of magnitude
better performance per watt for deep learning. In May 2017, Google unveiled the
second-generation TPUs and announced their availability in Google Compute Engine.
Second-generation TPUs deliver up to 180 teraflops of throughput, and up to 11.5
petaflops when arranged into groups of 64 TPUs [9].
TF allows developers to create a dataflow structure that describes how data moves and
which mathematical operations should be performed on the data as it travels from
one point to another in the structure. In this structure, data is represented as tensors
and mathematical operations as nodes. Figure 2 below shows both the data and its
flow [9].

Figure 2: TensorFlow overview

2.2.2 Tensor
TensorFlow allows you to create dataflow graphs that specify how information moves
through a graph by receiving data in the form of multidimensional arrays called
tensors. It enables you to create a flowchart of operations to be performed on this
data, which enters at one end and leaves at the other as output [10].

TensorFlow work is divided into three areas:

• Data modeling
• Model creation
• Model training and estimation
TensorFlow is so named because it accepts input in the form of multidimensional arrays,
known as tensors. You can create a flowchart of the operations you wish to run on that
input (called a graph). The data enters at one end, flows through this system of
operations, and finally exits at the other end as output.
Hence the name TensorFlow: a tensor enters, flows through a series of
operations, and then exits. Figure 3 below shows tensors of different
dimensions [11].

Figure 3: Tensor types and dimension

In ML and DL, data is represented numerically in the form of tensors. A tensor is a
container that can hold scalar, vector, and matrix data; in other words, it can
be understood as a multidimensional array. TF deals only with tensors: data is first
converted into tensors and then processed further. Even images are converted into
tables of pixel values, which are then treated as tensors and processed further.
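
The idea of a tensor as a multidimensional array can be made concrete with a short sketch. It uses plain nested Python lists rather than TensorFlow objects; the helper functions `shape` and `rank` and the 2x2 example image are assumptions introduced only for illustration.

```python
def shape(tensor):
    """Return the shape of a nested-list tensor as a tuple of dimension sizes."""
    dims = []
    while isinstance(tensor, list):
        dims.append(len(tensor))
        tensor = tensor[0] if tensor else None
    return tuple(dims)


def rank(tensor):
    """Number of dimensions: 0 for a scalar, 1 for a vector, 2 for a matrix."""
    return len(shape(tensor))


scalar = 7.0
vector = [1.0, 2.0, 3.0]
matrix = [[1, 2, 3], [4, 5, 6]]
# A hypothetical 2x2 RGB image: height x width x colour channels.
image = [[[255, 0, 0], [0, 255, 0]],
         [[0, 0, 255], [255, 255, 255]]]

for name, value in [("scalar", scalar), ("vector", vector),
                    ("matrix", matrix), ("image", image)]:
    print(name, "rank", rank(value), "shape", shape(value))
```

The image example shows why pictures fit the tensor model naturally: a colour image is a rank-3 array of pixel values, and a batch of such images adds a fourth dimension.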

2.2.3 Tensor Calculation


A tensor can originate either from the input data or from the result of a computation.
All computations in TensorFlow take place within a graph. The graph is a series of
computations that occur one after the other. Each operation is referred to as an
operation node, and the nodes are all linked together.
The graph shows the operations and the relationships between the nodes.
However, it does not show the data values. The edges between nodes are the tensors,
which are the mechanism for feeding the operations with data [13]. Figure 4 [10] below
shows the difference between a scalar, a vector, a matrix, and a tensor.

Figure 4: Difference between vector, matrix, scalar and tensor
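
The separation between describing a graph and feeding it with data can be sketched in a few lines. This is not TensorFlow's API: the `Node` class and its `evaluate` method are hypothetical names used only to show operation nodes linked by edges that carry values once data is supplied.

```python
class Node:
    """An operation node; its inputs are the edges that carry data into it."""

    def __init__(self, name, op=None, inputs=()):
        self.name = name
        self.op = op            # callable applied to the input values
        self.inputs = inputs    # upstream nodes this node depends on

    def evaluate(self, feed):
        """Compute this node's value, evaluating upstream nodes first."""
        if self.op is None:     # input node: its value is supplied by the caller
            return feed[self.name]
        values = [node.evaluate(feed) for node in self.inputs]
        return self.op(*values)


# Build the graph for (a + b) * c. No data flows until evaluate() is called.
a, b, c = Node("a"), Node("b"), Node("c")
total = Node("add", lambda x, y: x + y, (a, b))
result = Node("mul", lambda x, y: x * y, (total, c))

print(result.evaluate({"a": 2.0, "b": 3.0, "c": 4.0}))
```

Because the graph only describes operations and their connections, the same structure can be evaluated again with different input values without being rebuilt.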

2.2.4 TensorFlow Lite


TF Lite is a special variant of TF designed mainly for mobile, embedded, and edge devices.
With the help of TensorFlow Lite, existing models can be converted into an optimized
version in the form of a TF Lite file. TF provides different methods to optimize,
compress, and convert an ML model into the TF Lite format; some of them are discussed in
Section 2.3. Most embedded devices support TF Lite models. It is the lighter version of
TF and supports most TF functions.
TF is currently used around the world for different applications. It facilitates the
implementation of ML algorithms and inference for AI applications [14].
TF Lite, on the other hand, is a machine learning framework for on-device inference,
designed primarily for low-cost computing hardware. It enables machine intelligence
on-device by helping developers run their models on mobile, embedded,
and IoT devices [14].
Apart from the latency and size advantages of TF Lite, the framework keeps data
secure by processing it locally on the device. There is also no requirement for internet
access. As a result, deployments are not limited to places with connectivity.
TF Lite models are stored in a portable file format based on FlatBuffers, a serialization
library that saves data in a flat binary buffer, allowing the data to be accessed
directly without unpacking it first. A TF Lite model file is identified by the ".tflite"
extension. This format enables computational optimization and minimizes space
requirements. As a result, TF Lite models outperform TensorFlow models on such
devices [14].

2.3 Model optimization with TensorFlow


The following subsections focus on optimization techniques that address the problems of
deploying NNs on low-power embedded systems.

2.3.1 Quantization
Quantization is used to reduce the number of bits needed to represent a variable
while largely maintaining the accuracy of the network. It is used both to avoid
overfitting and to reduce the size of the network. A model can be quantized when it is
converted to the TF Lite format using the TF Lite Converter [14].
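
To make the idea of representing a variable with fewer bits concrete, the sketch below maps floating-point weights to 8-bit integers with a scale and a zero point and then maps them back. It is a generic affine quantization scheme written for illustration, not necessarily what the TF Lite Converter does internally; the function names and the example weights are assumptions.

```python
def quantization_params(values, num_bits=8):
    """Derive scale and zero point mapping [min, max] onto unsigned integers."""
    qmax = 2 ** num_bits - 1
    lo, hi = min(min(values), 0.0), max(max(values), 0.0)  # range must contain 0
    scale = (hi - lo) / qmax or 1.0      # avoid a zero scale for all-zero input
    zero_point = round(-lo / scale)
    return scale, zero_point, qmax


def quantize(values, scale, zero_point, qmax):
    """Map floats to integers in [0, qmax], clamping out-of-range values."""
    return [min(qmax, max(0, round(v / scale) + zero_point)) for v in values]


def dequantize(q_values, scale, zero_point):
    """Recover approximate floats from the stored integers."""
    return [(q - zero_point) * scale for q in q_values]


weights = [-1.2, -0.4, 0.0, 0.35, 0.9, 2.1]   # made-up example weights
scale, zero_point, qmax = quantization_params(weights)
q = quantize(weights, scale, zero_point, qmax)
restored = dequantize(q, scale, zero_point)
max_error = max(abs(w - r) for w, r in zip(weights, restored))
print(q, f"max error {max_error:.5f} (scale {scale:.5f})")
```

For values that are not clamped, the rounding error is at most half the scale, which is why accuracy can often be largely preserved when the weight range is small relative to the 256 available levels.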
According to research in the field of neuroscience, each synapse in the brain can take
26 distinct synaptic strengths, which corresponds to a storage capacity of about 4.7
bits. Yet, because it is the native word size of modern computers, current NNs often
rely on single- or double-precision floating-point values. Because these number formats
were never deliberately created for NNs, the question is whether we can reduce the
precision or design a number format that is optimized for NNs. Reduced precision might
have several advantages [15].
SIMD allows for more cost-effective and readily parallelized implementation of
functions with narrower, fixed-point numbers. As a result, energy consumption may be
reduced while computing speed is increased [15].
Optimal symmetric uniform quantizers for Gaussian, Laplacian, and Gamma distributions
have been studied thoroughly. In contrast, the spacing between levels varies with
non-uniform quantizers. The quantizers described above can be computed directly from
the value being quantized. Han et al., on the other hand, proposed weight-sharing
quantization [16]. In their approach, k-means is employed to cluster similar weights
separately for each layer. However, this solution requires a look-up table, which
increases memory usage and memory accesses. It is a form of quantization that cannot be
computed directly.
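The weight-sharing idea can be sketched as follows: the weights of one layer are clustered with a simple one-dimensional k-means, and each weight is then stored as a small index into a look-up table (codebook). This is a minimal sketch and not the implementation of Han et al.; the function name `share_weights`, the evenly spaced centroid initialization, and the iteration count are assumptions made for illustration.

```python
import numpy as np

def share_weights(weights, n_clusters=4, n_iter=20):
    """Cluster weights with 1-D k-means; return (codebook, indices)."""
    w = np.asarray(weights, dtype=np.float32).ravel()
    # Assumption: centroids start evenly spaced between min and max weight.
    codebook = np.linspace(w.min(), w.max(), n_clusters).astype(np.float32)
    for _ in range(n_iter):
        # Assign every weight to its nearest centroid.
        idx = np.argmin(np.abs(w[:, None] - codebook[None, :]), axis=1)
        # Move each centroid to the mean of its members.
        for k in range(n_clusters):
            members = w[idx == k]
            if members.size:
                codebook[k] = members.mean()
    idx = np.argmin(np.abs(w[:, None] - codebook[None, :]), axis=1)
    return codebook, idx.astype(np.uint8).reshape(np.shape(weights))

rng = np.random.default_rng(0)
layer = rng.normal(size=(8, 8)).astype(np.float32)
codebook, indices = share_weights(layer, n_clusters=4)
shared = codebook[indices]  # the look-up table access rebuilds the layer
print(codebook, np.unique(shared).size)
```

With four clusters every weight needs only a 2-bit index, but each access goes through the codebook, which is the look-up overhead mentioned above.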
Using quantization in the context of NNs is useful for the reasons stated above.
Figure 5 below shows the different types of quantization: uniform quantization and
non-uniform quantization [17].

Figure 5: Quantization types

2.3.2 Post-training quantization


Post-training quantization is a conversion technique that can reduce model size while
improving CPU and hardware accelerator latency, with little degradation in model
accuracy. TensorFlow provides several post-training quantization options. Figure 6 [28]
below shows a summary table of these options and the benefits they provide.

Figure 6: Quantization types of details

2.3.3 Pruning
Pruning removes the neurons or connections that contribute least to the output,
improving the generalization of the model and decreasing its size and computational
cost. This method was proposed back in 1990 [17], where the authors pruned a large ANN
by estimating the sensitivity of the error function to each connection and eliminating
the connections with the lowest sensitivity.
Model pruning consists of removing (setting to 0 permanently) certain parameters. The
parameters that are already close to 0 are pruned. This helps prevent a model from
overfitting, since the parameters that were deemed useless during training cannot be
reactivated. There are different ways to prune a model. A number of weights can be
pruned during training, or a pre-trained model can be pruned to simplify it and make
it lighter. Figure 7 below shows the different sparse structures in a 4-dimensional
weight tensor [18].

Figure 7: Types of sparsity from irregular to regular.

According to [21], pruning can be done in the following way. Figure 8 below illustrates
the three stages involved in the traditional pruning process [19].

Figure 8: Structure and working of the flow

Pruning is a strategy for introducing sparsity into NNs. The trimming of irrelevant
weights inside NNs, first advocated by LeCun et al., can lead to several improvements
[19]:
• Improved generalization (reduce overfitting).
• Reduced memory footprint.
• Shorter inference delay.
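
Setting the parameters closest to 0 to zero can be sketched with NumPy as magnitude pruning. This is a minimal illustration, not the TensorFlow pruning package; the function name `magnitude_prune` and the 50% sparsity target are assumptions chosen for the example.

```python
import numpy as np

def magnitude_prune(weights, sparsity):
    """Zero out the given fraction of weights with the smallest magnitude."""
    w = np.asarray(weights, dtype=np.float32)
    k = int(round(sparsity * w.size))  # number of weights to remove
    if k == 0:
        return w.copy(), np.ones(w.shape, dtype=bool)
    # Magnitude of the k-th smallest weight; everything up to it is pruned.
    threshold = np.sort(np.abs(w).ravel())[k - 1]
    mask = np.abs(w) > threshold
    return w * mask, mask

rng = np.random.default_rng(0)
layer = rng.normal(size=(10, 10)).astype(np.float32)
pruned, mask = magnitude_prune(layer, sparsity=0.5)
print("non-zero weights:", np.count_nonzero(pruned), "of", pruned.size)
```

The boolean mask records which connections survive; keeping it fixed during further training is what prevents pruned parameters from being reactivated.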

Sparsity can and should be exploited to reduce RAM usage and inference latency.
Pruning reduces the memory footprint of a weight matrix, which can then be stored in
sparse form, and can increase the speed of computations on such a two-dimensional
array. While filter and channel sparsity are simple to exploit, since they result in a
smaller dense structure, finer-grained sparsity requires specialized processors and
customized implementations, whether in hardware or software [20].

Sparsity can be a characteristic of a NN, but it can also be observed in other stages
of ML on the edge:

• Spatial sparsity: Due to the curse of dimensionality, input data is frequently
sparse. The relevant information is only a small percentage of the supplied data.
• Temporal sparsity: The signal's change over time is frequently sparse; only a small
amount of information is altered.
• Connection sparsity: According to current research, most connections inside NNs
contribute only a little. They can be trimmed using the pruning method described.
• Activation sparsity: Neurons are frequently not triggered during a forward pass
due to the nature of the activation functions. As a result, non-activated neurons
do not need to be processed further in the computation.

I will experiment with each approach on a dataset provided by TensorFlow, using the
TensorFlow packages provided for pruning.

2.4 Embedded devices


An embedded device is a special-purpose computing system. The system, which is built
entirely to serve its specific purpose, may or may not be able to connect to the
Internet. Embedded systems have large-scale applications in the consumer, commercial,
automotive, industrial, and healthcare industries. Because embedded systems have
limited computing resources and strict power requirements, writing software for
embedded devices is a very specialized field that requires knowledge of both hardware
components and programming. The embedded device focused on in this thesis is the
Android smartphone [21].

2.4.1 Micro controller


A microcontroller unit (MCU) is a small integrated circuit that performs a specific
function within an embedded platform. Its components include a CPU, memory, and an
input/output system. Because of their great energy efficiency, low cost, and easy
availability, MCUs are a popular component of embedded systems in IoT devices.
Notably, recent enhancements that have expanded the use of MCUs include hardware
floating-point support (FPU), SIMD support, and better energy conservation [22].
A digital signal processor (DSP) is a type of microprocessor that processes digital
signals. DSPs are employed in low-cost solutions and are suited to efficient
processing needs. A DSP's instruction set is distinct from that of a CPU. SIMD
instructions are generally implemented across DSPs, allowing an operation to be applied
to several values concurrently: rather than computing a single 32-bit addition, one
may, for example, perform four 8-bit additions at once. DSPs can be included in an MCU
architecture. The ARM Cortex-M4 and M7 architectures, for example, have a DSP [23].
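
The idea of performing four 8-bit additions in the space of one 32-bit operation can be emulated in plain Python with integer bit operations, a trick often called SWAR (SIMD within a register). This is only an illustration of the principle and not DSP or ARM code; the helper names `pack4`, `unpack4`, and `add4x8` are hypothetical.

```python
def pack4(lanes):
    """Pack four 8-bit values into one 32-bit word (lane 0 in the low byte)."""
    assert len(lanes) == 4 and all(0 <= v <= 255 for v in lanes)
    return sum(v << (8 * i) for i, v in enumerate(lanes))

def unpack4(word):
    """Split a 32-bit word back into its four 8-bit lanes."""
    return [(word >> (8 * i)) & 0xFF for i in range(4)]

def add4x8(a, b):
    """Add four 8-bit lanes at once; each lane wraps around modulo 256."""
    # Add the low 7 bits of every lane; no carry can cross a lane boundary.
    low = (a & 0x7F7F7F7F) + (b & 0x7F7F7F7F)
    # Fold in the top bit of every lane without carrying into the next lane.
    return low ^ ((a ^ b) & 0x80808080)

result = unpack4(add4x8(pack4([1, 2, 3, 200]), pack4([10, 20, 30, 100])))
print(result)  # [11, 22, 33, 44]; the last lane wraps: 300 mod 256 = 44
```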

2.4.2 Smartphones
Smartphones are among the most widely used devices today, and numerous development and
improvement projects are in progress for this category of embedded devices. TF
supports both the Android and iOS platforms. AI in the form of ML or DL has a large
scope on both platforms. ML is used in NLP, computer vision, recommendation systems,
and many other smartphone applications [24].

2.4.3 Raspberry pi
Raspberry Pi is a single-board computer that is mostly used by developers to build
hardware projects and for home automation, edge computing, industrial applications,
and IoT. TF officially started supporting Raspberry Pi in 2018 [24] in collaboration
with the Raspberry Pi Foundation. TF allows its users to integrate their ML or DL
models on Raspberry Pi with the help of TF Lite. Due to the optimization and
compression done by the TF Lite converter when an ML or DL model is converted into a
TF Lite model, the model performs well on such a small device.

2.5 Environment
2.5.1 Google Colab
There are several IDEs available for working with TensorFlow using the Python
programming language. Google's Colab is used for this thesis. Google Colab provides
both a notebook and a script interface, with free access to GPU and CPU resources for
ML developers. The official TF resources also provide a lot of supporting material
that runs on Colab. Colab also provides several utilities and pre-written functions to
save developers time [25].

2.6 Evaluation dependent Parameters


The performance of the model is determined by comparing the following quantities,
measured both in Colab on a computer and on a smartphone.

2.6.1 Accuracy of model


The accuracy of the model is an important factor, as it describes the model's
feasibility. Accuracy is usually calculated by comparing the model's predictions with
the ground-truth values.
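
A minimal sketch of this calculation with NumPy is given here. The function name `accuracy` and the small score matrix are made up for illustration and are not measurements from this thesis.

```python
import numpy as np

def accuracy(predicted_labels, true_labels):
    """Fraction of predictions that match the ground-truth labels."""
    predicted_labels = np.asarray(predicted_labels)
    true_labels = np.asarray(true_labels)
    return float(np.mean(predicted_labels == true_labels))

# For a classifier that outputs one score per class, the predicted
# label is the index of the highest score.
scores = np.array([[0.1, 0.9], [0.8, 0.2], [0.3, 0.7], [0.6, 0.4]])
acc = accuracy(np.argmax(scores, axis=1), [1, 0, 0, 0])
print(acc)  # 0.75: three of the four predictions are correct
```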

2.6.2 Time of prediction


The prediction time gives an understanding of the efficiency of the model, both in
Colab on a computer and on a smartphone.
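
One way to measure prediction time is to average the wall-clock time over many calls. The following is a minimal sketch using only the Python standard library; the helper name `mean_latency_ms`, the warm-up call, and the stand-in `predict` function are assumptions for illustration, not the measurement code used in this thesis.

```python
import time

def mean_latency_ms(predict, inputs, warmup=1):
    """Average time per prediction in milliseconds."""
    # Run a few calls first so one-time setup costs are not measured.
    for x in inputs[:warmup]:
        predict(x)
    start = time.perf_counter()
    for x in inputs:
        predict(x)
    elapsed = time.perf_counter() - start
    return 1000.0 * elapsed / len(inputs)

# Stand-in for a model: any callable that takes one input.
latency = mean_latency_ms(lambda x: sum(x), [[1, 2, 3]] * 100)
print(f"{latency:.4f} ms per prediction")
```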

2.6.3 Need of resources
The amount of storage space the model needs on the resource-limited device is
measured; the model's power consumption is also taken into consideration in this work.

2.7 Related work


Machine learning has a huge scope on smartphones. It is in demand in medicine,
education, entertainment, and marketing in the form of computer vision, natural
language processing, recommendation systems, and time-series problem solving. Examples
of machine learning can be seen everywhere, including:
• Google lens
• Amazon
• Netflix
• YouTube
The following works were found and studied using Google Scholar.

2.7.1 Design and optimization of a TF Lite on a smartphone


This work shows the scope of Machine Learning on a smartphone in the field of medicine
and health care. It benefits both sides: doctors can perform better analysis for
better prescriptions, and for the patient it is just an ordinary phone application
monitoring his health [26].
The source reports: “Quantization techniques are shown to reduce
the model’s weight representations to achieve a >30x model size reduction for improved
use on a smartphone. The result is an on-phone HAR model with accuracy of 92.7%
and a memory footprint of 27 KB.”
This work inspired me to explore ML integration on smartphones, as it can be used to
create valuable applications like this one, which can help an ordinary person get
treated through his smartphone. A well-trained ML model, combined with the
optimization and compression of TF Lite, gave both patient and doctor better results.

2.7.2 Deploying Image Deblurring Across Mobile Devices


The work describes the scope of Machine Learning on a smartphone in the field of
computer vision. Image enhancement and restoration tasks, such as super-resolution and
image deblurring, have become important applications on mobile devices. However, most
state-of-the-art networks have extremely high computational complexity. The source
[27] provides comprehensive experiments and comparisons that analyze both latency and
image quality.
This work inspired me to explore Machine Learning integration on smartphones, as it
can be used to create valuable applications like this one, which can perform very well
at restoring pixels in various fields of the graphics industry. A well-trained Machine
Learning model, combined with the optimization and compression of TF Lite, can make it
easy for developers to build a real-time version of their vision.

3 Methodology
This chapter describes the methods adopted during the planning of my thesis and the
workflow followed to achieve the implementation, evaluation, and conclusion of my
thesis.

3.1 Scientific method description


After giving a brief introduction to my thesis, I start by stating the problem
statement, followed by the objective of my thesis, and set the knowledge goals of my
thesis. To continue with my thesis and achieve the knowledge goals, it is necessary to
adopt a research method for each of my knowledge goals. The procedure for the
integration of TensorFlow Lite on embedded devices starts with understanding the
problem statement. As shown in figure 9, the literature review was done first and then
the gap was found. Knowledge goals were defined. A research method was selected, and
the implementation was done. After the implementation, all the obtained results were
compared and a conclusion was drawn.

Figure 9: scientific research method

A rigorous approach based on scientific study was used to investigate the ML
integration process. For a better grasp of my thesis's purpose, the literature was
examined to gain a fundamental understanding of Machine Learning, TensorFlow,
TensorFlow Lite, and other required topics. Following that, the solutions suggested by
the literature review would be implemented through hands-on practice.
A quantitative way of comparing the performance of Machine Learning models was chosen
for the evaluation. The objective is to collect data and compare measures to arrive at
conclusions. Measures will be taken from the start of this thesis' implementation and
compared to produce a conclusion. In the discussion and conclusion sections, the
findings of my investigation will be shared.

3.2 Project method description


To summarize the methodology adopted for my thesis, the following five steps define
the workflow I will follow.

3.2.1 Problem definition


The problem definition identifies the obstacle that needs to be removed to achieve the
goal, which in my case is hands-on practice of ML integration on a smartphone using
TensorFlow. The problem definition also supports the understanding of my thesis
objective. The background and problem motivation are described in 1.1, the overall aim
of my thesis in 1.2, and the questions that I raised within the scope of my thesis in
the problem statement in 1.3. It allowed me to narrow down the topics I needed to
focus on in my literature study, for good hands-on practice in the implementation and
a reliable conclusion.

3.2.2 Literature study


To continue my thesis, I needed prerequisite knowledge related to the thesis's
objective. This knowledge was gathered by searching for relevant keywords, for example
Machine Learning, TensorFlow, and TensorFlow Lite, on Google Scholar, which is one of
the most popular search engines for finding research articles, scientific literature,
journals, and books.
I used the literature from Google Scholar to conceptualize the topic and to increase
my knowledge and its relation to the objective of my thesis. The search focused
primarily on the scope of my thesis. A hierarchy of the literature was made to convey
a better picture of the search with regard to the objective of my thesis.

3.2.3 Implementation
After the literature study had been done, it was time for the implementation part of
my thesis. The implementation would be done using the TensorFlow library on Google's
Colab, and the evaluation would be done by comparing the model's performance-dependent
parameters in Colab on a computer and on an Android smartphone. The Machine Learning
model would first be designed using Keras, a neural network building library provided
within the TensorFlow library. Then the model would be trained and evaluated. Once I
had a trained model, I would perform its conversion, optimization, deployment on a
smartphone, and evaluation.
Since TensorFlow Lite is a tool provided by TensorFlow, the conversion and
optimization of the Machine Learning model for a smartphone would be done using the
TensorFlow library and its tools, for example the TensorFlow Lite converter, its
optimization options, and the TensorFlow Lite interpreter. After that, the Machine
Learning model would be in TensorFlow Lite format. The effect on the model evaluation
parameters would be recorded at this stage by running the TensorFlow Lite model in
Colab on the computer with the help of the TensorFlow Lite interpreter provided by TF.
The last step would be the TensorFlow Lite model's deployment and evaluation on an
Android smartphone. The figure below provides the steps performed in implementation.

3.2.4 Measurement setup


The measurement setup consists of the performance-dependent parameters (see chapter
2.6). These measurements are used for comparison and conclusion. Because the
conclusion depends on measuring and comparing quantitative values, I organize a
results table in which all the values of the performance-dependent parameters are
entered for later use in the comparison and conclusion.

3.2.5 Evaluation
Evaluation will be performed at three stages in my implementation.
• ML model Built in Keras Evaluation on Colab.
• TFLite model evaluation on Colab.
• TFLite model evaluation on smartphone.
The measurement setup lets me record the evaluated quantities in the table, which then
helps me compare them and draw conclusions.

3.3 Evaluation method


The evaluation method is based on a comparison of the model's performance-dependent
parameters, measured with the measurement setup at the three stages of implementation
discussed in the evaluation section. The evaluation of these three stages would be
done on different testing data samples. The methods used during the implementation of
Machine Learning model conversion and optimization are discussed in 2.3 and are
evaluated by comparing them and concluding which method performs better. All the
questions raised in the problem statement would be answered at the time of evaluation.
After the evaluation is done, a conclusion would be drawn based on the facts
established during the literature study, the measurement setup, and the hands-on
practice. The challenges and limitations faced during the progress of the thesis will
be discussed in the discussion chapter to defend the conclusion. The size, accuracy,
and response time will be measured, and the MNIST dataset of handwritten digit images
will be used.

4 Implementation
In this chapter I carry out the hands-on building, evaluation, compression, and
optimization of an ML model and its integration on an Android smartphone using the
TensorFlow library. The workflow in figure 10 shows how the implementation is
performed.

Figure 10: Implementation flowchart

4.1 Machine Learning model building using Keras


First, the required and supporting libraries for my implementation were imported. The
TF library provides almost all the prewritten functions needed for ML model building,
analysis, evaluation, and data pre- and post-processing. The NumPy library is used to
convert data to tensors and vice versa. Matplotlib is used to display images during
the implementation.
Building and training a model requires a prepared dataset. The MNIST dataset of
handwritten digit images was chosen; it is available within the TF library and has
60000 images for training and 10000 images for validation and testing. The data was
loaded and preprocessed using Python commands in Colab. The object of the desired
dataset from the TF library was defined and its load data function was called, which
returned two tuples containing the training and testing images with their labels.
As previously discussed, TensorFlow only deals with data in tensor form. The data was
prepared by following the instructions in the official TF documentation. The pixel
values were normalized by dividing by 255, the maximum value a pixel can have.
Normalization can save some computational power and time, because the model now deals
with much smaller values than before.
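The normalization step can be sketched with NumPy as follows. Random pixel data stands in for the MNIST images so that the example runs without downloading the dataset; the function name `preprocess` is a hypothetical helper, not the code used in the thesis.

```python
import numpy as np

def preprocess(images):
    """Scale uint8 pixel values (0..255) to float32 values in [0, 1]."""
    return np.asarray(images, dtype=np.float32) / 255.0

# Stand-in for a batch of four 28x28 grayscale images.
rng = np.random.default_rng(0)
fake_batch = rng.integers(0, 256, size=(4, 28, 28), dtype=np.uint8)
x = preprocess(fake_batch)
print(x.dtype, float(x.min()), float(x.max()))
```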
I built the model using Keras, which is a neural network building library available
within TF. Figure 11 below shows the Python commands that were used to make a basic
sequential model consisting of an input layer, three dense hidden layers that hold the
trainable neurons, and an output layer.

Figure 11: Base model designing

After designing the layer structure of my model, I defined the optimizer, loss, and
accuracy metric for the model. The optimizer defines the rules for improving the model
based on the loss, where the loss is a measure of the difference between the true
values and the predictions. If the predictions deviate too much from the actual
results, the loss function produces a very large number. The accuracy metric describes
how well the predictions agree with the true values; if the predictions are almost
always correct, the accuracy metric gives a value near one hundred percent. This was
done using Python commands.

Figure 12: model summary

The model summary was fetched to investigate the model's structure and parameters
using the Python command shown in figure 12. The model had 111,146 trainable
parameters, that is, the weights and biases of its layers.
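
The reported count can be checked by hand. The sketch below assumes a 28x28 input followed by dense layers with 128, 64, 32, and 10 units; these layer widths are an assumption that happens to reproduce the reported 111,146 parameters and are not stated in the text.

```python
def dense_params(n_in, n_out):
    """Weights plus biases of one fully connected layer."""
    return n_in * n_out + n_out

# Assumed layer widths: flattened 28x28 input, three hidden layers, 10 classes.
layer_sizes = [28 * 28, 128, 64, 32, 10]
per_layer = [dense_params(a, b) for a, b in zip(layer_sizes, layer_sizes[1:])]
total = sum(per_layer)
print(per_layer, total)  # [100480, 8256, 2080, 330] 111146
```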

4.2 Machine Learning model training and evaluation
The training data, the validation data, and the number of epochs were passed to the
model so that it could start training. In each epoch, the model trained on the
training data, was validated on the validation data, and calculated the accuracy and
loss used for optimization; this process was repeated once per epoch. Each epoch was
reviewed through the log output during training, which showed details such as the loss
and accuracy for both training and validation. This helped to monitor the performance
of the model training.
The model was evaluated so that the results could be compared later. The Python
commands that were used to evaluate the model are shown below in figure 13.
First, an image was selected from the testing images provided in the MNIST dataset,
then the image was converted to a tensor of the required shape and datatype. The image
was displayed using Matplotlib and, after preprocessing, fed to the model for
prediction.

Figure 13: base model testing output

4.3 TensorFlow lite Conversion and optimization


As the trained model was available, I could test the conversion and optimization of
the model using the TF Lite converter provided within the TF library. TF provides
almost every supporting function needed to save developers time and give them a very
user-friendly experience.

4.3.1 TensorFlow Lite Conversion
The TF Lite converter was used to convert the Machine Learning model to TF Lite format
in just two steps.

1. Initializing the object of the TF Lite converter.
2. Calling the convert function on the initialized object.

The hands-on practice of these two steps is shown in figure 14.

Figure 14: TF Lite conversion
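
The two conversion steps can be sketched with the public `tf.lite.TFLiteConverter` API as follows. This is a hedged sketch and not necessarily the exact code in figure 14; the helper names `convert_to_tflite` and `size_kb` are hypothetical, and TensorFlow is imported inside the function so that the helpers can be defined even where TensorFlow is not installed.

```python
def convert_to_tflite(keras_model):
    """Convert a trained Keras model to TF Lite flatbuffer bytes."""
    import tensorflow as tf  # lazy import: only needed when converting

    # Step 1: initialize the converter object from the Keras model.
    converter = tf.lite.TFLiteConverter.from_keras_model(keras_model)
    # Step 2: call convert(), which returns the serialized model as bytes.
    return converter.convert()

def size_kb(tflite_bytes):
    """Size of a serialized model in kilobytes."""
    return len(tflite_bytes) / 1024.0

# Usage, assuming `model` is a trained tf.keras model:
#   tflite_model = convert_to_tflite(model)
#   with open("model.tflite", "wb") as f:
#       f.write(tflite_model)
#   print(size_kb(tflite_model))
```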

4.3.2 TensorFlow Lite Converter optimization


First, standard dynamic quantization was applied while converting the model to a TF
Lite model. I passed my trained model to the TF Lite converter object. After this
conversion, the size of the model was reduced to 23KB due to standard dynamic
quantization. It quantized the weights of the model to 8 bits, but the input, output,
and other intermediate values in the layers of the network still used 32-bit floats.
The input and output data types were verified using Python commands.
To quantize the input, output, and hidden-layer variables as well, only int
quantization was used, which quantizes the input, output, and hidden-layer variables
to uint8. This method of quantization requires a representative dataset, that is, a
small subset of randomly chosen samples from the original dataset. TF's
tf.data.Dataset.from_tensor_slices function was used to fetch a subset of the original
MNIST dataset.
Once the requirement of only int quantization was satisfied, it was applied to my
model while converting it to the TF Lite model. Python commands were used to apply
only int quantization to my model. After applying only int quantization, the model's
input and output data types were converted to uint8.
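
Both quantization variants can be sketched with the public converter options. This is a hedged sketch rather than the thesis code: the helper names are hypothetical, the representative dataset is built with plain NumPy slicing instead of tf.data.Dataset.from_tensor_slices, and the sample count of 100 is an assumption.

```python
import numpy as np

def representative_dataset(images, n_samples=100):
    """Return a generator function that yields single-image float32 batches."""
    def gen():
        for image in images[:n_samples]:
            yield [np.asarray(image, dtype=np.float32)[np.newaxis, ...]]
    return gen

def convert_quantized(keras_model, calibration_images=None):
    """Dynamic quantization by default; only int quantization when
    calibration images are supplied."""
    import tensorflow as tf  # lazy import: only needed when converting

    converter = tf.lite.TFLiteConverter.from_keras_model(keras_model)
    converter.optimizations = [tf.lite.Optimize.DEFAULT]
    if calibration_images is not None:
        converter.representative_dataset = representative_dataset(calibration_images)
        # Restrict the model to integer kernels and uint8 input/output.
        converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]
        converter.inference_input_type = tf.uint8
        converter.inference_output_type = tf.uint8
    return converter.convert()

# The generator can be inspected without TensorFlow:
samples = list(representative_dataset(np.zeros((5, 28, 28)), n_samples=3)())
print(len(samples), samples[0][0].shape, samples[0][0].dtype)
```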

4.4 TensorFlow lite Evaluation on Colab


The model was loaded in the TF Lite interpreter, and its input and output data types
were checked using Python commands. The TensorFlow Lite interpreter is provided by TF
to load, run, analyze, and study a TF Lite model.
To run the model and check its performance, the TF Lite model's input shape and data
type must be checked first. The input data, in my case images, is then changed
accordingly: the images were converted into the desired input data type and shape
before being fed to the model for prediction. This process is known as pre-processing.
Machine Learning developers must describe the pre- and post-processing instructions
along with the model. The input is then fed to the model for prediction, as shown
below in figure 15.

Figure 15: TF Lite model analysis.

The input image is shown in figure 16, and the TF Lite model predicted the right
number.

Figure 16: input image

4.5 TensorFlow Lite Model integration on Android smartphone
To integrate my model into an Android application, TF's official GitHub repository was
used, which provides a demo application in which a model can be placed and evaluated.
The model was placed in the application and evaluated by running the application on my
smartphone. The results are shown in the screenshot below in figure 17.

Figure 17: Android smartphone screenshot

5 Results
The findings of the measurements gathered during the implementation phase are
presented in this chapter. The outcome is presented as a table, with a snapshot of the
evaluation cell's output in Colab added; the Colab notebook can be accessed from the
link in the appendix. The size, accuracy, and response time results are also provided
in the form of grouped bar charts for comparison purposes.

5.1 Resulting system


For my thesis the resulting system has three cases. The first is the evaluation of the
base model on Colab. Figure 18 below shows the evaluate function provided by TF, to
which the test images and test labels were passed to evaluate my base model.

Figure 18: base model evaluation and response time measurement

In the second case, the TF Lite models were evaluated. TF Lite models require the
following three steps for a test run.

1- Initialize TF Lite interpreter
2- Fetch model details (Input, Output, Data type)
3- Run inference

The helper function shown in figure 16 initializes the TF Lite interpreter, fetches
the model details, and then starts a loop to predict the images in the test data one
by one. Once a prediction is made, it is compared with the actual result to determine
the accuracy of the model as a performance-dependent parameter.
The third case is a test run of the model on a smartphone, where I evaluate the model
by its performance in real life.
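
The three steps of the second case can be sketched with the public `tf.lite.Interpreter` API. This is a hedged sketch of such a helper and not the code from the figure; the function names `evaluate_tflite` and `summarize` are hypothetical, and the uint8 branch assumes the input quantization parameters reported by the interpreter.

```python
import numpy as np

def summarize(predictions, labels):
    """Accuracy: fraction of predictions equal to the true labels."""
    return float(np.mean(np.asarray(predictions) == np.asarray(labels)))

def evaluate_tflite(tflite_model, images, labels):
    """Run a TF Lite model over normalized float images one by one."""
    import tensorflow as tf  # lazy import: only needed when evaluating

    # 1- Initialize the TF Lite interpreter.
    interpreter = tf.lite.Interpreter(model_content=tflite_model)
    interpreter.allocate_tensors()
    # 2- Fetch model details (input, output, data type).
    inp = interpreter.get_input_details()[0]
    out = interpreter.get_output_details()[0]
    predictions = []
    for image in images:
        x = np.asarray(image, dtype=np.float32)
        if inp["dtype"] == np.uint8:
            # Quantized input: real value = scale * (q - zero_point).
            scale, zero_point = inp["quantization"]
            x = np.clip(np.round(x / scale + zero_point), 0, 255)
        x = x.astype(inp["dtype"]).reshape(inp["shape"])
        # 3- Run inference.
        interpreter.set_tensor(inp["index"], x)
        interpreter.invoke()
        predictions.append(int(np.argmax(interpreter.get_tensor(out["index"]))))
    return summarize(predictions, labels)

print(summarize([1, 2, 3, 4], [1, 2, 3, 0]))  # 0.75
```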

5.2 Measurement results


First, the base model that I designed and trained in Keras using TF was evaluated; the
results are shown in figure 19.

Figure 19: output of base model evaluation

The base model took 1.3 seconds to predict, predicted 97% of the results correctly,
and has a size of 1341.44 KB. Then the TF Lite model variants produced during the
implementation were evaluated, and the output is shown in figure 20.

Figure 20: output of TF Lite models evaluation

Figure 20 shows the measurements of each model separately. The differences in the
values of the performance-dependent parameters are discussed further in the next
chapter. Then I evaluated the model on an Android device by test running it in real
life, and it showed great performance. Figure 17 demonstrates it working on an Android
smartphone.

Table 1 presents the measured size, response time, and accuracy of each model.

Model                                       Size (KB)   Response time (ms)   Accuracy (%)
Keras model                                 1341.44     1.3207               97.60
TF Lite model                               436         0.2889               97.60
TF Lite model with standard quantization    112         0.2814               97.53
TF Lite model with only int quantization    112         0.6228               100

Table 1: Measured size, Response time and accuracy of models.
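
The relative savings implied by Table 1 can be worked out with a few lines of arithmetic. The sketch below only re-uses the values from the table; the helper names `reduction_percent` and `shrink_factor` are hypothetical.

```python
def reduction_percent(before, after):
    """How much smaller `after` is than `before`, in percent."""
    return 100.0 * (1.0 - after / before)

def shrink_factor(before, after):
    """How many times smaller `after` is than `before`."""
    return before / after

# Values copied from Table 1.
keras_kb, tflite_kb, quant_kb = 1341.44, 436.0, 112.0
keras_ms, tflite_ms = 1.3207, 0.2889

size_saving = reduction_percent(keras_kb, tflite_kb)   # about 67.5 %
time_saving = reduction_percent(keras_ms, tflite_ms)   # about 78.1 %
quant_factor = shrink_factor(keras_kb, quant_kb)       # about 12x
print(f"{size_saving:.1f}% {time_saving:.1f}% {quant_factor:.1f}x")
```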

Figure 21 presents a bar chart of the models' inference time, which shows how much
time each model took to predict the value or result.

Figure 21: Model Inference Time Bar chart.

Figure 22 presents a bar chart of model accuracy, which shows how many accurate
predictions were made by models in percentage.

Figure 22: Model accuracy Bar chart.

Figure 23 presents a bar chart of model size, which shows how much storage space each
model requires.

Figure 23: Bar chart of storage space required by each model in terms of size

6 Discussion
This chapter discusses the results based on the analysis of the implementation of my
thesis. The discussion also addresses other aspects, namely the project method, the
scientific aspects, and the ethical and societal aspects of ML integration on a
smartphone.

6.1 Analysis and discussion of results


The implementation of Machine Learning models on smartphones using TF and TF Lite
showed great performance, and the tools made the process much easier. Google invests
heavily in the performance, reliability, and improvement of TF.
The differences in the measurements shown in Table 1 indicate that TF Lite is a
reliable option when it comes to Machine Learning integration on smartphones. The base
model has a size of 1342 KB, which is reduced to 436 KB when it is converted to the TF
Lite model. This greatly reduces the model's need for resources, which matters
especially when the model is made for a device with limited resources. On the other
hand, converting the model does not affect the accuracy of the model.
Applying standard quantization during the conversion of the model improves the
compression and optimization of the model further. Its size is reduced to 112 KB with
a negligible effect on the model's accuracy and response time. This is again a big
achievement, as the model shrinks to roughly a quarter of the size of the unquantized
TF Lite model and yet performs like it.
The bar charts show a clear drop in model size and response time while the accuracy of
the model is maintained. So, one can train any type of model in Keras, or even use a
pre-trained model, and implement one's vision of Machine Learning in an application
easily using TensorFlow and its tools.
TF provided prewritten functions for data collection, data preprocessing, model
designing, transfer learning, model training, model evaluation, and model integration
on different devices and environments. Another advantage of using TF was the
availability of guides and suggestions on official and non-official blogs.

6.2 Project method discussion


All the study milestones stated in chapter 3.2 are discussed in depth in this section.
The discussion concerns whether the chosen method is appropriate and how well it was
implemented.

6.2.1 Definition of the issue


The strategy used to find and acquire information resulted in a broad and deep
comprehension of the scientific principles behind the conversion and optimization of a
model using TF. This strategy helped me obtain all the necessary background
information for my thesis goal. The work that was done in conjunction with it provided
a unique perspective.

6.2.2 Study of literature
The key task that this thesis would address was defined in this way. The
characterization of the problem at the start of the project made it easier to grasp
what was needed and where to look for it.

6.2.3 Implementation
The method used to build the implementation is divided into the 5 parts described in
chapter 5. The first part was designing the model structure: defining the layers, loss
function, activation function, optimizer, number of epochs, and other settings needed
to start model training. This was done with Keras, the neural network building library
provided within TF. It was hard to design a model without a strong background in
ML and Keras, but the TF blogs and support resources helped a lot during the process.
The data used for training is also available within the TF library, which made it easier
for me to collect, label, clean, arrange, and pre-process data for my thesis
implementation and study. In the next part I trained the model on the provided training
data and evaluated it on the testing data. Then I performed the standard TF Lite
conversion and evaluated the result, which showed almost the same accuracy with a
large drop in model size and response time.
In the next part I performed the TF Lite conversion with the dynamic and integer-only
approaches discussed in the theory chapter, using the TF Lite converter provided by TF
to convert and optimize the Keras model. The integer-only model is the base model
converted to TF Lite with integer-only quantization. I then test-ran the converted
models using the TF Lite interpreter; they maintained the accuracy of the model while
reducing the model size and response time. That a model 7x smaller performs like the
original model is solid evidence that TF offers a user-friendly and reliable
environment.
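The three conversion variants described above (standard, dynamic, and integer-only)
can be sketched roughly as follows. This is a minimal illustration and not the code
used in the thesis: the function name convert_keras_model, its mode argument, and the
representative_data argument are hypothetical, and the model passed in is assumed to be
an already trained Keras model. The converter calls themselves are the public
tf.lite.TFLiteConverter API.

```python
def convert_keras_model(model, mode="standard", representative_data=None):
    """Convert a trained Keras model to a serialized TF Lite model (bytes).

    mode is a hypothetical switch for this sketch:
      "standard" - plain conversion, weights stay float32
      "dynamic"  - dynamic-range quantization of the weights
      "int8"     - integer-only quantization (needs sample inputs)
    representative_data is an iterable of float32 input batches.
    """
    if mode not in ("standard", "dynamic", "int8"):
        raise ValueError(f"unknown mode: {mode!r}")
    if mode == "int8" and representative_data is None:
        raise ValueError("int8 conversion needs representative_data")

    # Imported here so the argument checks above work without TensorFlow.
    import tensorflow as tf

    converter = tf.lite.TFLiteConverter.from_keras_model(model)

    if mode in ("dynamic", "int8"):
        # Enables post-training quantization.
        converter.optimizations = [tf.lite.Optimize.DEFAULT]

    if mode == "int8":
        def representative_dataset():
            # The converter calibrates activation ranges on these samples.
            for batch in representative_data:
                yield [batch]

        converter.representative_dataset = representative_dataset
        # Restrict the model to int8 kernels and int8 input/output tensors.
        converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]
        converter.inference_input_type = tf.int8
        converter.inference_output_type = tf.int8

    return converter.convert()
```

Under these assumptions, a call such as convert_keras_model(model, mode="dynamic")
returns the converted model as bytes, which can be written to a .tflite file and later
loaded by the TF Lite interpreter.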
I deployed the trained model to an Android application and tested it on an Android
smartphone, where it predicted the handwritten word in real time. It can be used in
applications with handwriting-to-text modules and can also work on text data as Google
Lens does. The main difficulty was selecting among the many available approaches;
someone with stronger background knowledge can choose better.

6.3 Scientific discussion


This section focuses on the scientific aspects of the thesis. The discussion covers
the research concerns that this study addresses, as well as the limitations of support
and grants. This research is about the process of ML model conversion and optimization
for smartphone integration. The models developed during the thesis can predict
handwritten words; Google Lens uses these kinds of well-trained models and performs
well out of the box.
Although the study focuses on the conversion and optimization of a machine learning
model, the methods described in section 2.4 can also be employed with a pre-trained
model. During the entire procedure, TF was extremely user-friendly. For ML model
creation, training, conversion, and optimization, the evaluation metrics indicated
that TF is a suitable tool to work with. The method used in this study appears to be
reasonable in terms of meeting the study's overall goals. The system and its results
appear appropriate for use in industry, machine learning research, and other
real-world applications. The approaches described and practiced in this thesis can be
used to integrate ML on embedded devices, including an Android smartphone.

6.4 Ethical and societal discussion


In the industrial sector, deploying machine learning helps to ease the work
environment, modernize technology, and increase product quality. Because different
sorts of components connected to the same network regulate the components connected
within the ProfNet network, this can lead to a better society. However, as machines
and modern technology take over, the need for workers in the labor market may fall.
As a result, job opportunities for humans are shrinking day by day as machines are
preferred over humans.
Smartphones are another example of the evolution of technology. They are currently
equipped with technologies such as machine learning and AI to handle tasks in
photography, design, and other areas. This thesis can provide a link for industry to
apply machine learning effectively in smartphones.

7 Conclusions
After the fruitful outcome of my thesis, I arrive at the following conclusions.
I conclude that the scope of ML is growing very rapidly in devices with limited
resources, especially smartphones, as they are the most used devices nowadays.
TensorFlow and TensorFlow Lite are good tools to use when integrating ML on
smartphones, as the measurements taken during the implementation of my thesis were
in favor of TF and TF Lite.
TensorFlow Lite optimizes and compresses the model so that its size and response
time drop without affecting the accuracy of the model. This is the main challenge in
integrating ML on devices that have limited resources but a wide scope for ML in
their applications.
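To give an intuition for why quantization can shrink a model while keeping accuracy
close to the original, the sketch below maps a few float values to int8 and back using
a generic affine scheme (real value = (int8 value - zero point) * scale). It is only an
illustration under simplifying assumptions: the function names and the sample weights
are hypothetical, and the actual TF Lite converter works per tensor or per channel with
calibrated ranges rather than this single list.

```python
def quantize_int8(values):
    """Map floats to int8 codes with one scale and zero point (affine scheme)."""
    # The represented range must contain 0 so that 0.0 maps to an exact code.
    lo = min(min(values), 0.0)
    hi = max(max(values), 0.0)
    scale = (hi - lo) / 255.0
    if scale == 0.0:
        scale = 1.0  # all-zero input: any scale works
    zero_point = round(-128 - lo / scale)
    codes = [max(-128, min(127, round(v / scale) + zero_point)) for v in values]
    return codes, scale, zero_point


def dequantize(codes, scale, zero_point):
    """Recover approximate float values from int8 codes."""
    return [(c - zero_point) * scale for c in codes]


weights = [-0.51, -0.2, 0.0, 0.13, 0.77]  # hypothetical sample weights
q, scale, zp = quantize_int8(weights)
restored = dequantize(q, scale, zp)
max_error = max(abs(a - b) for a, b in zip(weights, restored))

print("int8 codes:", q)
print("largest round-trip error:", max_error, "(half a step is", scale / 2, ")")
```

Each value is stored in one byte instead of the four bytes of a float32, and the
round-trip error stays within half a quantization step, which is why small rounding
errors of this kind often leave the predictions unchanged.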
TF provided facilities for data loading, data labeling, data preprocessing, model
design, model building, model evaluation, and model compression, together with support
for almost every platform in industry, under one roof. I did not need any other ML
tool while using TensorFlow.
TensorFlow Lite user support is also extraordinary compared with that of competing
tool providers in the market. Researchers and developers prefer TensorFlow Lite for
ML applications on devices with limited resources because of its highly supportive
environment.

7.1 Answers to the problem statement


Q1 - What is the scope and application of Machine Learning in smartphones?
Machine Learning is present across smartphone functions, ranging from health
monitoring to tracking wake-up and sleeping times and enabling user-friendly modes at
those times. These applications define a great scope for ML in smartphones, and that
scope is increasing day by day in all categories of Machine Learning models: natural
language processing, computer vision, and time series prediction.

Q2 - What are the challenges and limitations faced while deploying the Machine
Learning models on smartphones?
The main challenge was the limited resources of a smartphone compared to Colab. The
task was to compress and optimize the base model in a way that does not affect its
accuracy. It was hard to match the accuracy of the base model after this much
compression and optimization.

Q3 - What is the brief procedure of deploying Machine Learning on a Smartphone
using TensorFlow Lite?
Deploying Machine Learning on a smartphone first requires data collection and
labeling when the project is a custom one. Then the model structure is designed, and
the model is trained, either from scratch for a custom project or with a transfer
learning approach. The model is converted to the TF Lite format (tflite) with standard
or custom optimization and compression techniques. The converted, compressed, and
optimized TF Lite model is then placed on the target device to do its work.

Q4 - What are the effects on a Machine Learning model's performance when it is
compressed and optimized for an application in a smartphone as compared to the
original model working on a standard computer or laptop?
The size of the base model was 1342 KB, which was reduced to 112 KB when it was
converted to a TensorFlow Lite model; this greatly reduces the resources the model
needs. The accuracy remained the same despite the change in the size of the model.
Different image material may change the accuracy.
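As a quick worked check, the two sizes quoted in this answer can be turned into a
compression ratio and a percentage reduction. The variable names are illustrative, and
the calculation applies only to these two figures; other model variants measured in the
thesis may give different ratios.

```python
base_kb = 1342    # base model size quoted in the answer above
tflite_kb = 112   # TF Lite model size quoted in the answer above

ratio = base_kb / tflite_kb                       # how many times smaller
reduction_pct = 100 * (1 - tflite_kb / base_kb)   # share of the size removed

print(f"{ratio:.1f}x smaller, {reduction_pct:.1f}% size reduction")
# prints: 12.0x smaller, 91.7% size reduction
```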

7.2 Future Work


Future work could consider the following implementations of Machine Learning on
smartphones.

7.2.1 Text detection and recognition


Work on alphabet and word detection and recognition using a good data set, following
the process adopted in this thesis to implement and test-run the trained model on both
Android and iOS devices.
This would be done by training a well-structured ML model in TF using the Keras
neural network building library. Once a base model is available, the methodology of
this thesis can be followed to obtain the required model. The resulting model would
be deployed on a smartphone and test-run in real life, as was done in this thesis.

7.2.2 Object detection and tracking


Work with more than one model to solve the problem of object detection and tracking,
in which one model performs object detection and the other object tracking.
The first model would detect the object in the frame; once an object is detected, the
second model would locate it in the frame so that a bounding box can be drawn around
it for tracking purposes. Two well-trained models would be acquired to solve the
problem, and each would be optimized and compressed separately.
This would explore the performance of TF Lite models when they work as a team to
solve a problem.

31
References
1. David, R., et al., TensorFlow lite micro: Embedded machine learning for tinyml
systems. Proceedings of Machine Learning and Systems, 2021. 3: p. 800-811.
2. Demosthenous, G. and V. Vassiliades, Continual learning on the edge with
tensorflow lite. arXiv preprint arXiv:2105.01946, 2021.
3. David, R., et al., TensorFlow lite micro: Embedded machine learning on tinyml
systems. arXiv 2020. arXiv preprint arXiv:2010.08678.
4. Warden, P. and D. Situnayake, Tinyml: Machine learning with tensorflow lite on
arduino and ultra-low-power microcontrollers. 2019: O'Reilly Media.
5. Louis, M.S., et al. Towards deep learning using tensorflow lite on risc-v. in Third
Workshop on Computer Architecture Research with RISC-V (CARRV). 2019.
6. Tom Mitchell, 1997
7. Alpaydin, E., Introduction to Machine Learning. Third edit. 2014, MIT Press.
8. Gorospe, J., et al., A Generalization Performance Study Using Deep Learning
Networks in Embedded Systems. Sensors, 2021. 21(4): p. 1031.
9. Alsing, O., Mobile object detection using tensorflow lite and transfer learning.
2018.
10. Ma, Y., et al., Transfer learning for cross-company software defect prediction.
Information and Software Technology, 2012. 54(3): p. 248-256.
11. Tang, J., Intelligent Mobile Projects with TensorFlow: Build 10+ Artificial
Intelligence Apps Using TensorFlow Mobile and Lite for IOS, Android, and
Raspberry Pi. 2018: Packt Publishing Ltd.
12. Anjum, S., et al. A pull-reporting approach for floor opening detection using
deep-learning on embedded devices. in ISARC. Proceedings of the International
Symposium on Automation and Robotics in Construction. 2021. IAARC
Publications.
13. Osman, A., et al. TinyML Platforms Benchmarking. in International Conference
on Applications in Electronics Pervading Industry, Environment and Society.
2022. Springer.
14. riley, o.: p. p#1,2,5,6.
15. Zhang, J., et al. dabnn: A super-fast inference framework for binary neural
networks on arm devices. in Proceedings of the 27th ACM international
conference on multimedia. 2019.
16. Frajberg, D., et al. Accelerating deep learning inference on mobile systems. in
International Conference on AI and Mobile Services. 2019. Springer.
17. HardienJ, Deep Learning Book Series · 2.1 Scalars Vectors Matrices and Tensors.
2018.
18. Gorospe Jauregui, J., A Generalization Performance Study Using Deep Learning
Networks in Embedded Systems. 2021.
19. Tensor.2021;Availablefrom:
https://www.tensorflow.org/lite/performance/model_optimization.
20. Pearce, H., et al., Designing Neural Networks for Real-Time Systems. IEEE
Embedded Systems Letters, 2020. 13(3): p. 94-97.
21. Wang, Y., et al. Pruning from scratch. in Proceedings of the AAAI Conference on
Artificial Intelligence. 2020.
22. Martinez-Alpiste, I., et al., Smartphone-based real-time object recognition
architecture for portable and constrained systems. Journal of Real-Time Image
Processing, 2022. 19(1): p. 103-115.

32
23. Yang, X., et al. A compositional approach using Keras for neural networks in
real-time systems. in 2020 Design, Automation & Test in Europe Conference &
Exhibition (DATE). 2020. IEEE.
24. Gao, Y. and K.M. Mosalam, Deep transfer learning for image‐based structural
damage recognition. Computer‐Aided Civil and Infrastructure Engineering,
2018. 33(9): p. 748-768.
25. Khekare, G. and K. Solanki, REAL TIME OBJECT DETECTION WITH SPEECH
RECOGNITION USING TENSORFLOW LITE.
26. Salah Eddin Adi Department of Electrical and Electronic Engineering Design
and optimization of a TensorFlow Lite deep learning neural network for human
activityrecognition on a smartphone 202.
https://ieeexplore.ieee.org/abstract/document/9629549
27. Deploying Image Deblurring Across Mobile Devices: A Perspective of Quality
and Latency Cheng-Ming Chiang, Yu Tseng, Yu-Syuan Xu, Hsien-Kai Kuo, Yi-
Min Tsai, Guan-Yu Chen, Koan-Sin Tan, Wei-Ting Wang, Yu-Chieh Lin, Shou-
Yao Roy Tseng, Wei-Shiang Lin, Chia-Lin Yu, BY Shen, Kloze Kao, Chia-Ming
Cheng, Hung-Jen Chen; Proceedings of the IEEE/CVF Conference on Computer
Vision and Pattern Recognition (CVPR) Workshops, 2020, pp. 502-503
28. TensorFlow “ Post-training quantization”
https://www.tensorflow.org/lite/performance/post_training_quantization

33
A TF Lite models initialized, defined and evaluated

B Model training
