DC Lecture 1
1. https://www.w3schools.com/cs/index.php
2. https://docs.microsoft.com/en-us/visualstudio/windows/?view=vs-2022
3. Check the demo code attached with the lecture in the BB
4. Watch the Demo in ILecture
What even is distributed computing anyway?
● Processors are fast! (but can only go so fast)
● More processors == More fast!
● Concurrent computing == Great! (ignoring the headache of synchronising stuff)
● Problem: I only have 1 processor (in the 1990s, anyway)
● Solution: Use other PCs’ processors!
And why would anyone care?
● Distributed number crunching is great and all, but you know what else is cool? The Internet.
● People all have computers, but those are far away from my database/application.
● Why not distribute parts of my application to those computers?
● And this is basically how everything works now.
To Summarize
● Distributed computing is splitting applications into multiple processes on multiple machines.
● Requires a lot of inter-process communication (IPC).
● Is not a simple Client-Server architecture!
  – The application must be doing some useful work in multiple places.
  – However, that’s pretty much how all Client-Server apps work these days anyway, so the distinction gets confusing.
Client, Server, Peer to Peer
● Client-Server application:
  – Dumb client accesses a smart server, which does all the work up there.
  – Examples: SSH, FTP, plain HTTP.
● Peer to Peer application:
  – No servers; clients all talk amongst each other, but no coordinated, orchestrated job is being done.
  – Examples: BitTorrent, mesh WiFi networks.
● Distributed Application:
  – Can be Peer to Peer (cryptocurrencies) or Client-Server (MMORPGs).
  – A coordinated, singular application being run across all member computers.
Are Browsers Distributed?
● Browsers themselves? No.
  – Basic HTML pages aren’t either.
● What runs on them is, these days:
  – Online documents
  – Webmail clients
  – Online games
  – Web crypto miners (although you may wish they weren’t)
So you want to distribute your app
● You’ll need an IPC method.
● Needs to support networks
  – Although quite a few of these predate modern networking!
● You might just think to use TCP/UDP sockets.
● This is a less than great idea.
But I like the TCP/IP stack!
● TCP/IP is too general purpose
  – Sure, you can send packets, but every application needs you to re-invent the wheel and build a protocol on top of TCP/IP! (See the raw-socket sketch below.)
● What you want is a protocol that hides all the messiness of networking
  – Ideally so the programmer can’t see the distribution.
  – Pro: You don’t need special skills/design principles to be a distributed programmer!
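A minimal C# sketch of what "rolling your own" on top of TCP looks like; the host name, the port, and the one-line text protocol are invented purely for illustration:

    using System;
    using System.Net.Sockets;
    using System.Text;

    class RawSocketClient
    {
        static void Main()
        {
            // Even one "call" means inventing a wire protocol: here we send a
            // UTF-8 command string and hope the server agrees on the framing.
            using var client = new TcpClient("server.example.com", 5000); // hypothetical host/port
            using NetworkStream stream = client.GetStream();

            byte[] request = Encoding.UTF8.GetBytes("GETPIN 12345\n");
            stream.Write(request, 0, request.Length);

            // We also have to decide how to read the reply: how many bytes,
            // what encoding, and what to do about partial reads or dropped links.
            byte[] buffer = new byte[256];
            int read = stream.Read(buffer, 0, buffer.Length);
            Console.WriteLine(Encoding.UTF8.GetString(buffer, 0, read));
        }
    }

Every application built this way repeats the same framing, encoding, and error-handling decisions, which is exactly the wheel-reinvention the slide complains about.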
Remote Procedure Calls
● Programs are made of function calls.
● So… why not have a function call that actually happens somewhere else?
  – Maybe in another process, maybe on another computer
  – It should look the same as a regular one
● Very old concept; the original RFC 707 dates from 1976!
  – Xerox tried it first (because of course they did)
  – Sun’s ONC RPC system was the first popular one.
  – Many other systems exist.
  – Most people just use the web these days.
Creating an RPC Function
● Define the functions that can be used remotely
  – This defines a “server interface” (a C# sketch of this follows)
    ● e.g. int getPinNo( IN int customerID )
● “Compile” the definition using an RPC utility
  – This generates source code for your application:
    ● A stub for the client side that does remote comms to the server
    ● A server-side function that handles server-side comms, and an empty “implementation” function for you to fill out.
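In C# terms, the "server interface" above could be expressed as an ordinary interface for a stub generator to consume. This is only a sketch; the interface name is made up, and a real RPC toolchain may use its own IDL file or attributes instead:

    // A sketch of the "server interface" for the getPinNo example above.
    public interface IAccountService
    {
        // The IN parameter travels from client to server;
        // the int result travels back.
        int GetPinNo(int customerId);
    }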
Creating an RPC Function
● The client function acts as a proxy, taking whatever data you feed it (including parameters and calls) and sending it to the server side.
● The server side then reassembles the transmitted data and uses your function to perform the operation.
● The result is then sent back over to the client side and returned to whatever called the stub via normal methods.
Creating an RPC Function
● Fill in the function for the server side
  – As if you were just implementing any other function
● Add calls to the client-side function
  – As if it really existed on the client side
● Compile both client and server code
  – This comprises your code, the generated code, and the libraries used by your RPC infrastructure.
  – The library code does all the messy network bits.
● Finally, distribute and run! (A sketch of the two hand-written pieces follows.)
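A sketch of the two pieces you write yourself, assuming the generator produced the IAccountService interface from earlier plus a proxy class called AccountServiceStub (both names are hypothetical; here the stand-in stub just forwards locally so the sketch is self-contained):

    using System;

    // The "server interface" (would normally come from the RPC definition).
    public interface IAccountService
    {
        int GetPinNo(int customerId);
    }

    // Server side: you fill in the generated "implementation" function
    // as if it were any other local function.
    public class AccountService : IAccountService
    {
        public int GetPinNo(int customerId)
        {
            // Real code would look the PIN up somewhere; a constant keeps
            // the sketch self-contained.
            return 1234;
        }
    }

    // Stand-in for the generated client stub. The real generated class would
    // marshal the call and send it over the network instead of calling locally.
    public class AccountServiceStub : IAccountService
    {
        private readonly IAccountService _remote = new AccountService();
        public int GetPinNo(int customerId) => _remote.GetPinNo(customerId);
    }

    class Program
    {
        static void Main()
        {
            // Client side: the call looks exactly like a local function call.
            IAccountService service = new AccountServiceStub();
            Console.WriteLine($"PIN: {service.GetPinNo(42)}");
        }
    }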
RPC – The Messy Part
● At run time, calls to your distributed function are redirected on the client side to the RPC libraries.
● The following slides will give you some idea of what happens next.
● We’ll be sticking to communication over a network, but this could theoretically also be done via OS message passing (in the case of two local processes).
RPC – The Messy Part
● Client side:
  – The client proxy makes the call ready for network transfer
  – This process is called marshaling (not to be confused with serialization)
    ● Information including the function signature and parameters is bundled into a single byte array (sketched below).
    ● This is then sent across the network in some fashion as a message.
  – The client proxy then blocks until the server responds or an error occurs.
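A toy illustration of marshaling in C#. The wire layout (a length-prefixed method name followed by the raw parameter bytes) is an assumption invented for this sketch, not the format of any real RPC system:

    using System;
    using System.IO;
    using System.Text;

    // A toy marshaler for a single known call shape.
    static class Marshaler
    {
        public static byte[] MarshalCall(string methodName, int customerId)
        {
            using var buffer = new MemoryStream();
            using var writer = new BinaryWriter(buffer, Encoding.UTF8);

            writer.Write(methodName);   // which function the server should run
            writer.Write(customerId);   // the IN parameter
            writer.Flush();

            return buffer.ToArray();    // a single byte array, ready to send
        }
    }

    class Demo
    {
        static void Main()
        {
            byte[] message = Marshaler.MarshalCall("GetPinNo", 42);
            Console.WriteLine($"Marshaled {message.Length} bytes");
            // The client proxy would now send `message` to the server
            // and block until a reply (or an error) comes back.
        }
    }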
RPC – The Messy Part
● Server side:
  – The RPC infrastructure receives the IPC request on a network socket
  – Examines the function signature to determine the destination (see the dispatch sketch below)
  – Unmarshals the function info, and calls the server function.
    ● This is where your code gets run
  – On return, the return value and output parameters are marshaled and sent back to the client.
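The server-side counterpart to the toy marshaling sketch, again with an invented wire format and a hard-coded implementation function; real RPC infrastructure generates this dispatch code for you:

    using System;
    using System.IO;
    using System.Text;

    class Dispatcher
    {
        static int GetPinNo(int customerId) => 1234;   // "your code" on the server

        static byte[] HandleRequest(byte[] message)
        {
            using var reader = new BinaryReader(new MemoryStream(message), Encoding.UTF8);

            string methodName = reader.ReadString();        // examine the signature
            switch (methodName)
            {
                case "GetPinNo":
                    int customerId = reader.ReadInt32();    // unmarshal the parameters
                    int result = GetPinNo(customerId);      // run the real function
                    return BitConverter.GetBytes(result);   // marshal the reply
                default:
                    throw new InvalidOperationException($"Unknown method {methodName}");
            }
        }

        static void Main()
        {
            // Build a request the same way the client-side sketch did.
            using var buffer = new MemoryStream();
            using (var writer = new BinaryWriter(buffer, Encoding.UTF8, leaveOpen: true))
            {
                writer.Write("GetPinNo");
                writer.Write(42);
            }
            byte[] reply = HandleRequest(buffer.ToArray());
            Console.WriteLine($"Reply: {BitConverter.ToInt32(reply, 0)}");
        }
    }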
RPC – The Messy Part
● Client 2: Electric Boogaloo:
  – The client RPC infrastructure receives the response from the server
  – Unmarshals the result and output parameters
  – Passes these values back to the calling function as per normal
  – The program continues as usual.
RPC – The Messy Part
What is Serialization?
● We’ll get back to this later in the semester, but in a nutshell:
  – Marshaling converts the signatures and parameters of functions into a byte array for RPC
  – Serialization converts the data and signature of an object into some kind of byte format (example below).
    ● This is not just for distribution! Can be used for disk storage, object comparison, hashing, etc.
  – RPC parameters may be marshaled by serializing the contents, but nobody serializes by marshaling.
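For instance, one common way to serialize an object in C# is System.Text.Json, and it is completely independent of any remote call. The Transaction class here is just a made-up example:

    using System;
    using System.Text.Json;

    // Serialization turns an object's data into bytes (or text),
    // whether or not a network is involved.
    public class Transaction
    {
        public int Id { get; set; }
        public decimal Amount { get; set; }
    }

    class Demo
    {
        static void Main()
        {
            var t = new Transaction { Id = 7, Amount = 19.95m };

            // To bytes: could go to disk, a hash function, or across a network.
            byte[] bytes = JsonSerializer.SerializeToUtf8Bytes(t);

            // And back again.
            Transaction copy = JsonSerializer.Deserialize<Transaction>(bytes)!;
            Console.WriteLine($"{copy.Id}: {copy.Amount}");
        }
    }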
Message passing
● You don’t need to marshal to serialize
  – This can lead to some very data-centric network distribution models.
  – Send messages of data rather than calling remote functions (see the sketch below).
  – There are specific alternative protocols for this:
    ● MPI (Message Passing Interface), used in parallel computing
    ● Message queues, used in other data-centric applications
  – These may or may not be actual distributed systems.
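A small illustration of the data-centric style using an in-process queue; real MPI or a message broker would sit where the BlockingCollection is, and the message shape is invented for the sketch:

    using System;
    using System.Collections.Concurrent;
    using System.Threading.Tasks;

    // Data-centric style: the sender puts a message describing *data* on a
    // queue; the receiver decides what to do with it. No function call is implied.
    class MessagePassingDemo
    {
        record ReadingMessage(string Sensor, double Value);

        static void Main()
        {
            var queue = new BlockingCollection<ReadingMessage>();

            var consumer = Task.Run(() =>
            {
                foreach (var msg in queue.GetConsumingEnumerable())
                    Console.WriteLine($"{msg.Sensor} = {msg.Value}");
            });

            queue.Add(new ReadingMessage("temp", 21.5));
            queue.Add(new ReadingMessage("humidity", 0.43));
            queue.CompleteAdding();     // no more messages

            consumer.Wait();
        }
    }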
Messaging vs RPC
● RPC is all about hiding the network activity as function calls
  – Highly function-oriented
● Message passing involves clients and servers sending messages to each other.
  – Data-oriented
  – Application-specific (may be data, may be instructions, may return a reply, may not)
● Modern web-based systems are stateless, which results in very blurred lines between the two.
RPC Pros and Cons
● Pros:
  – It’s just like calling a local function
    ● Messaging requires you to build a lot of network code; not so with RPC systems
● Cons:
  – Remote calls have weird problems (see the sketch below):
    ● Much slower, due to networks
    ● Can fail, due to networks
    ● You have no idea what happened or where you got up to… due to networks
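To make the last point concrete, here is a sketch of the failure handling a remote call forces on you. The URL is hypothetical; the point is simply that a call which looks ordinary can now time out or fail partway through, and you cannot tell how far it got:

    using System;
    using System.Net.Http;
    using System.Threading.Tasks;

    class RemoteCallDemo
    {
        static async Task Main()
        {
            using var http = new HttpClient { Timeout = TimeSpan.FromSeconds(2) };
            try
            {
                // Looks like any other call, but it rides on the network.
                string pin = await http.GetStringAsync("http://server.example.com/getPinNo?customerId=42");
                Console.WriteLine($"PIN: {pin}");
            }
            catch (TaskCanceledException)
            {
                // Slow network: did the server do the work or not? We can't tell.
                Console.WriteLine("Timed out");
            }
            catch (HttpRequestException e)
            {
                // Network or server failure: again, unknown how far the call got.
                Console.WriteLine($"Network error: {e.Message}");
            }
        }
    }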
RPC Cons continued
● Worst of all, the system looks just like local function calls.
  – This means programmers can have a hard time understanding that things are distributed.
  – This leads to incomprehensible bugs caused by networking errors (how can you reference memory over a network?).
  – This is crippling for projects that are highly distributed (for example, FOSS).
Let’s introduce some other concepts you’re going to need
Components
● RPC is a great IPC mechanism…
● But it’s only a fix for one problem, namely how to communicate.
● The real question is: what do you distribute?
  – Where are you going to put which parts, and why?
● Components are the de facto solution to this problem
What are components?
● A component is a set of functions that work together.
● It is also the smallest chunk of an application that can be moved to a remote server.
● Essentially, it provides “a service” to other components.
● This is vaguely related to the concept of prefabrication.
Components and Objects
● Components are defined by the interface they expose to external clients
● This is very similar to how objects can be defined by their interface to other objects.
● This makes OO-style programming unreasonably effective for building components (see the sketch below).
● This is one of the main reasons we’re using C#.
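A small sketch of that idea in C#: the component is the interface it exposes, and the implementing objects behind it can change freely without external clients noticing. The names here are made up for illustration:

    // A component is defined by what it exposes, not how it is built.
    public interface IPrintingComponent
    {
        void SubmitJob(string documentPath);
        int JobsQueued { get; }
    }

    // Internal implementation detail: invisible to external clients,
    // who only ever see IPrintingComponent.
    internal class SpoolerBackedPrinting : IPrintingComponent
    {
        private readonly System.Collections.Generic.Queue<string> _jobs = new();

        public void SubmitJob(string documentPath) => _jobs.Enqueue(documentPath);
        public int JobsQueued => _jobs.Count;
    }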
OO Components
● Components are the interface for external clients, and link with the application dynamically via the operating system.
● Objects are the internal implementation of the component systems, and are linked statically by the compiler.
● These objects can then have further interfaces, allowing internal object systems to interact with external clients via the OO interfaces.
Components, Objects, and RESTing
● These two concepts are very similar:
  – Component: Service-oriented, provides methods for clients
  – Object: Data-oriented, methods transform its own state.
● Components represent the services a server can perform; objects represent things in a simulated system.
● This is blurred significantly by representational state transfer (REST) services (services that produce and consume objects).
Components, Objects, and RESTing
● Example (sketched in code below):
  – Broker.DoTransaction() is a service, and so likely a function in a component.
  – Transaction.SetBroker() is a mutation, and so likely a function in an object.
  – Both of these are valid actions in a modern REST-based web service!
    ● Depending on how you’ve built it, of course.
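The slide's example sketched in C#. The class shapes are assumptions; they only exist to show the service-style call next to the state-mutating call:

    // Component flavour: a service performed *for* a client.
    public class Broker
    {
        public void DoTransaction(Transaction t)
        {
            // Do the work on behalf of the caller, e.g. record who handled it.
            t.SetBroker(this);
        }
    }

    // Object flavour: a mutator that changes the object's own state.
    public class Transaction
    {
        private Broker? _broker;

        public void SetBroker(Broker broker)
        {
            _broker = broker;
        }
    }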
Focusing on Objects and Components
● They both:
  – Define public interfaces for clients
  – Can inherit interfaces from other objects/components
  – Encapsulate data (and can provide information hiding)
  – Define a cohesive set of actions and variables that can fulfil a role in the application
  – Are reusable (theoretically).
Differences
● Objects:
  – Technically specific
  – Are actually source code in the app
  – Internal, only seen by other objects
  – Compile-time linking
  – Implementation is specific and reusable
● Components:
  – Architecturally specific
  – Independent of actual code
  – Exposed to other applications
  – Use run-time linking
  – Implementation details are obtuse
Components and RPC
● Put these together and you have the basis for modern distributed computing.
  – Components are easy to think about, and Objects naturally snap on to their interfaces.
● However, this introduces many problems
  – First among these: the network is not a stable medium like RAM.
  – What are the consequences of spreading state over a network?
Networking: Making LAN Parties Hard Since 1999
● When you’re locally based, errors occur because you did something wrong.
● When you’re on the network, things go wrong because of the network.
  – Network fails, no program for you
  – Server fails, no program for you
  – Local configuration fails, no program for you
● The big problem is figuring out which of these happened, and when.
*Nix vs Windows vs OS/2
● Another major (extremely major) problem is that of varying architectures.
● A Linux program and a Windows program are not binary compatible.
● Tools are very often walled into an ecosystem
● Unearthly hackery often resulted.
The Service Layer
● The current way that operating systems work is via something called Service Oriented Architectures.
● These have become so pervasive that all operating systems (that matter) currently have services built in
  – Sorry Unix fanboys, systemd is actually a result of this.
What is an SOA?
● Services are components that clients connect to for execution of a task
  – For example, the printing service, or the login service.
● They cannot be passed around, and are instead resident (clients have to go to them; see the sketch below).
● This changes the mentality of designing a distributed system (and a system altogether).
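A sketch of "the client goes to the service": here a client connects to a hypothetical resident local service over a named pipe with a well-known name. The pipe name and the one-line request protocol are invented for illustration:

    using System;
    using System.IO;
    using System.IO.Pipes;

    class ServiceClientDemo
    {
        static void Main()
        {
            // The service must already be resident and listening on this name;
            // the client goes to it, never the other way round.
            using var pipe = new NamedPipeClientStream(".", "print-service", PipeDirection.InOut);
            pipe.Connect(timeout: 1000);

            using var writer = new StreamWriter(pipe) { AutoFlush = true };
            using var reader = new StreamReader(pipe);

            writer.WriteLine("STATUS");             // ask the service to do a task
            Console.WriteLine(reader.ReadLine());   // e.g. "3 jobs queued"
        }
    }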
Why?
● Well, have you ever tried to pass a pointer to something physically in memory on another machine?
● What if the objects that comprise your object are on another machine, and that machine cannot be accessed?
● Service structures prevent this kind of scattered distribution by chunking off sets of objects, defining distinct, siloed behaviours via groups of components.
The Evolution of Services
● Services are great
  – Lower coupling over networks
  – Less implicit trust between components
  – Improved security due to fewer exploitable pathways
● Services are about 20 years old now.
  – They are fundamental to operating systems
  – But they have brought the Web into our computers
  – Object orientation has suffered
    ● Objects are now second-class citizens
    ● Good OO design is now actively eschewed
Next Week
● The History of RPC / Component Frameworks.
  – Meet such exciting characters as COM, DCOM, CORBA, and XML SOAP.
  – Learn about the great conflict between Java EE and .NET
  – Discover why Web Services have totally dominated all other RPC methods (although you can probably guess this one)