CVM - Channel Virtual Machine
A Small C++ Framework For Dynamically
Reconfigurable Distributed Applications
Yigong Liu (2/14/2006)
Table Of Contents
1. Design Rationale
1.1. Dynamic reconfigurability
1.2. ACE's facilities
1.3. Channel's role
2. Base Framework
2.1. Common Runtime / Executable
2.2. Framework Classes
2.3. Instantiation Classes
3. CVM Application Development
3.1. CVM Instantiation
3.2. Application Partition, Messaging Interface and Code
Development
3.3. Application Configuration
4. CVM system composition and
configuration
5. Dynamic system reconfiguration and
Control
1. Design Rationale
There are various methodologies for
designing multithreaded and distributed applications. One that is
especially attractive to the author is the methodology of Bell Labs'
Plan 9 and Inferno distributed operating systems and their CSP[][]
programming style. Basically, designs based on explicit message passing
(or channel communication) are strongly encouraged over the shared
memory model.
A CVM application is partitioned into
co-operating tasks (or threads), running inside the same process,
different processes or different machines. The preferred interaction interface
among CVM tasks is explicit message passing; there should be no shared state among CVM tasks,
and no task should change another task's state "from underground" (without
going through messaging). This is just a design preference, whose benefits
are discussed below; CVM doesn't prohibit shared memory designs.
1.1. Dynamic reconfigurability - Separation of System Design and System Configuration
Modern distributed applications
often require dynamic reconfigurability:
- the ability to partition applications into distinct modules and
designate / relocate modules to different processes or machines at
runtime based on machines' load and status.
- the ability to dynamically load and link a new version of modules
without shutting down the application, for the purpose of software
maintenance and upgrade.
Dynamic reconfigurability means the separation of
system design and system configuration / deployment:
- Design: centered around tasks and messaging, which tasks are
created for what
application functionality, what are the messages used by these tasks to
communicate.
- Configuration / Deployment: resource assignment and allocations,
which tasks are running in which processes on which machines, what are
the communication connections among these processes and machines.
Configuration must satisfy the requirements of design, such as which
pair of tasks should be able to communicate with each other. Since the
same system design can run in various system configurations (the
extreme case is that all tasks run in the same address space), the
framework is called a "virtual machine".
Dynamic reconfigurability means that configuration
issues are removed from the design and development phases and deferred to
the deployment phase or
runtime.
CVM achieves dynamic
reconfigurability based on two design choices:
- Application functionalities are partitioned into CVM tasks. Each
task is built into a shared library (or DLL) which can be dynamically
loaded and linked into the application process. So a task is both the unit of
software development and the unit of application configuration and
deployment. The same set of application tasks can be reconfigured
without code changes into different processes
or different machines for
load balancing or debugging.
- Explicit message passing is the preferred / sole interface among
application tasks. There is no "hard wiring" among tasks such as
pointers, references, function calls or even shared memory regions. So
application tasks can be freely distributed to different "hosting"
processes or machines and keep functioning, provided that proper
communication (message passing) connections are set up among hosting
processes.
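The second design choice can be illustrated with a minimal sketch in standard C++. It does not use CVM, Channel or ACE; the Mailbox class and the ping_pong_once() function are hypothetical names introduced only for this illustration. Two tasks run in separate threads and interact only by sending messages, so neither holds a pointer or reference into the other's state.

```cpp
#include <condition_variable>
#include <mutex>
#include <queue>
#include <string>
#include <thread>
#include <utility>

// Hypothetical minimal mailbox (not part of CVM): the only object that
// two tasks have in common, used purely to pass messages.
template <class Msg>
class Mailbox {
public:
    void send(Msg m) {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            queue_.push(std::move(m));
        }
        ready_.notify_one();
    }
    // Blocks until a message is available.
    Msg receive() {
        std::unique_lock<std::mutex> lock(mutex_);
        ready_.wait(lock, [this] { return !queue_.empty(); });
        Msg m = std::move(queue_.front());
        queue_.pop();
        return m;
    }
private:
    std::mutex mutex_;
    std::condition_variable ready_;
    std::queue<Msg> queue_;
};

// Runs a "pong" task in its own thread and plays the "ping" task in the
// calling thread; returns the reply received by the ping side.
std::string ping_pong_once() {
    Mailbox<std::string> to_pong, to_ping;
    std::thread pong([&] {
        std::string m = to_pong.receive();   // wait for a request
        to_ping.send(m + "/pong");           // answer by message only
    });
    to_pong.send("ping");
    std::string reply = to_ping.receive();
    pong.join();
    return reply;
}
```

Because the two sides touch nothing but the mailboxes, replacing the in-process mailbox with a socket-backed one would, in principle, let them run in different processes without changing the task logic.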
CVM is built on top of facilities provided by ACE
and Channel, which are further discussed below.
1.2. ACE's facilities
ACE (Adaptive Communication Environment) is a
powerful and portable OO/C++ framework for system programming. It
provides not only wrapper facade classes to abstract the complete OS
facilities, but also frameworks and design patterns for developing
multithreaded and distributed applications, some of which are Reactor,
Service Configurator, Task and Acceptor-Connector.
CVM achieves its dynamic reconfigurability by using
the following two ACE frameworks:
- Task Framework
Tasks provide features for
thread spawning, thread pool management, and message queues for
application
message passing.
- Service Configurator Framework
The Service Configurator allows an
application to link/unlink its components at run-time without having to
recompile, statically relink, and restart the entire application.
1.3. Channel's role
Channel is a C++ template framework for
message passing and event dispatching. Channel's major components are
statically configurable as template parameters, such as message id
type, synchronization strategies (MT-safe or not) and routing
algorithms. Channels in different processes and machines can be
connected to allow transparent distributed message passing.
CVM uses dynamically loadable instances of channels
to provide communication among tasks.
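The idea of a channel whose id type and synchronization strategy are template parameters can be sketched as follows. This is not the Channel library's actual interface: SketchChannel, NullLock and MtLock are hypothetical names, and the routing is a plain table lookup chosen for brevity.

```cpp
#include <cstddef>
#include <functional>
#include <map>
#include <mutex>
#include <string>
#include <utility>
#include <vector>

// Hypothetical synchronization policies, selected at compile time.
struct NullLock {               // no locking, for single-threaded use
    void lock() {}
    void unlock() {}
};
using MtLock = std::mutex;      // real locking, for MT-safe use

// Hypothetical channel: message id type and lock policy are template
// parameters, loosely mirroring the static configurability described above.
template <class Id, class Lock = NullLock>
class SketchChannel {
public:
    using Handler = std::function<void(const std::string&)>;

    void subscribe(const Id& id, Handler h) {
        std::lock_guard<Lock> guard(lock_);
        routes_[id].push_back(std::move(h));
    }

    // Delivers the payload to every subscriber of id and returns how many
    // subscribers were reached.
    std::size_t publish(const Id& id, const std::string& payload) {
        std::vector<Handler> targets;
        {
            std::lock_guard<Lock> guard(lock_);
            auto it = routes_.find(id);
            if (it != routes_.end()) targets = it->second;
        }
        for (auto& h : targets) h(payload);   // call handlers outside the lock
        return targets.size();
    }

private:
    Lock lock_;
    std::map<Id, std::vector<Handler>> routes_;   // table-based routing
};
```

With this shape, SketchChannel<int> gives an integer-id, lock-free variant, while SketchChannel<std::string, MtLock> gives a string-id, thread-safe one, without any change to the code that subscribes or publishes.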
2. Base Framework
CVM base framework consists of three base classes
and one common runtime/executable.
2.1. Common Runtime / Executable
cvm, the common runtime, is a simple executable with only a main()
function, inside which ACE_Service_Configurator is opened.
cvm
is an empty shell, without any application logic. Its sole purpose is to
instantiate an application process and provide the hosting environment
into which application modules in the form of shared libraries (or DLLs)
can be configured and loaded. The configuration and loading of
application modules are controlled by ACE configuration specifications,
which are further described in Sections 4 and 5.
2.2. Framework Classes
The framework classes are the
classes of application modules which can be configured and loaded into cvm. There are three framework
classes: channels, connectors, and tasks. Among them, channels and
connectors are configured to set up the communication framework of
application processes; and tasks are threads hooked into the
communication framework and carry out application logic. Task is the
sole class application code should inherit and extend:
template <class Channel>
class CvmBaseChannel : public CvmService
CvmBaseChannel is a simple wrapper over normal channel objects
providing the required interface for ACE_Service_Configurator, so that
channels can be dynamically loaded and linked into cvm. It is also templated by the
channel type (message id types, routing algorithms, ...).
template <class Channel, class Transport>
class CvmBaseConnector : public CvmService
CvmBaseConnector is a wrapper over normal channel connectors, providing
the required interface for ACE_Service_Configurator, so that connectors
can be dynamically loaded into cvm.
It is templated by the channel type and transport type (TCP socket,
Unix domain socket, ...).
template <class Channel>
class CvmBaseTask : public ACE_Task<ACE_MT_SYNCH>
CvmBaseTask is the base class of all application tasks. It has the
following three methods related to a task's life cycle. The ACE Task
framework calls these three and other default life-cycle methods to
init, finish, open and close application tasks. Most
application code should inherit from this class and implement the three
methods to add application logic:
virtual channel::Status prepare(void) // initialization before threads start
virtual channel::Status cleanup(void) // cleanup before threads exit
virtual int work(void)                // main processing loop
There could be one or more threads running inside a task and the number
is configurable.
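The division of labour between the framework and application code can be sketched with a stand-in base class. SketchTask, Status and CountingTask below are hypothetical and do not reproduce CvmBaseTask or ACE_Task; in particular, run() calls the hooks sequentially in the calling thread, whereas the real framework would run work() in one or more spawned threads.

```cpp
#include <string>
#include <vector>

// Stand-in for channel::Status (assumed to distinguish success and failure).
enum class Status { Ok, Failed };

// Hypothetical stand-in for the task base class; not the real CvmBaseTask.
class SketchTask {
public:
    virtual ~SketchTask() = default;
    virtual Status prepare() { return Status::Ok; }   // before threads start
    virtual Status cleanup() { return Status::Ok; }   // before threads exit
    virtual int work() = 0;                           // main processing loop

    // Mimics the order in which a framework would drive the three hooks.
    int run() {
        if (prepare() != Status::Ok) return -1;
        int rc = work();
        cleanup();
        return rc;
    }
};

// Application code only overrides the hooks; it never calls them itself.
class CountingTask : public SketchTask {
public:
    std::vector<std::string> log;
    Status prepare() override { log.push_back("prepare"); return Status::Ok; }
    Status cleanup() override { log.push_back("cleanup"); return Status::Ok; }
    int work() override { log.push_back("work"); return 0; }
};
```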
TBD: need to add dynamically loadable wrappers for filters and
translators and bind them to connectors
2.3. Instantiation Classes
CVM base classes are templated by
Channel type. Based on different application requirements and tradeoffs,
proper message id types, routing algorithms and transports can be chosen
to instantiate these classes. Further discussion is in the following
section.
3. CVM Application Development
CVM application development consists of 3 aspects:
configuring the static properties of channels, implementing application
tasks, and deciding the configuration and deployment of application
modules to different processes and runtimes.
3.1. CVM Instantiation (Static Configuration)
CVM uses channels for inter-task
communication, and Channel is a template class with message id type,
synchronization strategy and routing algorithm as template parameters.
The first step is to choose proper template arguments and obtain a
valid CVM instantiation.
Different application domains have different
requirements and tradeoffs for performance, space, and maintenance. In
embedded systems, integer ids and table-based routing are normally used
for efficiency. In desktop or server applications, message ids may be
more meaningful: string names, family/types, or hierarchical structures to
allow group-based communication.
Inside the Channel distribution package (.tar.gz file),
three basic message id types and their trait classes are defined for
practical use and demonstration purposes: integer, string, and simple
POD struct. If a new id type is necessary, its trait class should also be
defined to provide the required operations and methods for routing
calculations.
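What a trait class for a new id type might look like can be sketched as below. The names IdTrait, FamilyTypeId, less, match and kAnyType are all assumptions made for this illustration; the operations the Channel package actually requires from a trait class may differ.

```cpp
// Hypothetical POD id with a family/type structure.
struct FamilyTypeId {
    int family;
    int type;
};

// Hypothetical trait template: one specialization per id type, supplying
// the operations a router might need (ordering and subscription matching).
template <class Id>
struct IdTrait;

template <>
struct IdTrait<int> {
    static bool less(int a, int b) { return a < b; }
    static bool match(int sub, int msg) { return sub == msg; }
};

template <>
struct IdTrait<FamilyTypeId> {
    enum { kAnyType = -1 };   // wildcard value, assumed for illustration

    static bool less(const FamilyTypeId& a, const FamilyTypeId& b) {
        return a.family != b.family ? a.family < b.family : a.type < b.type;
    }

    // A subscription whose type is kAnyType receives every message of
    // that family, which gives a simple form of group communication.
    static bool match(const FamilyTypeId& sub, const FamilyTypeId& msg) {
        return sub.family == msg.family &&
               (sub.type == kAnyType || sub.type == msg.type);
    }
};
```

Keeping these operations in a trait rather than in the id type itself lets plain integers and POD structs be used as ids without modification.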
Each CVM instantiation creates specialized
versions of the base classes: CvmChannel | CvmTcpConnector |
CvmUnixConnector | CvmTask, all of which are contained in a single
shared library libCvm.so (or libCvm.dll).
Each CVM instantiation can be tested at first with
sample applications (such as ping/pong).
3.2. Application Partition, Messaging Interface And Tasks Development
The development of
CVM based applications involves the following steps:
- partition application
functionalities into distinct application tasks
- define the
messaging interfaces among application tasks. Based on CVM
instantiation, message id and data structure should be defined for
each application message.
- create application tasks by inheriting from CvmTask and
implementing the task life cycle methods:
  - virtual channel::Status prepare(void)
    called by the framework for
    initialization before threads start; the following initialization can
    be done here:
    . subscribe to message types
    . open files or databases, load config data, restore application context,
      etc.
  - virtual channel::Status cleanup(void)
    called by the framework for cleanup
    before threads exit;
    the following cleanup can be done here:
    . unsubscribe from messages
    . save application context
    . close files or database connections
  - virtual int work(void)
    the main processing loop; the
    framework will use it as the main function of threads
3.3. Application Configuration (Dynamic
Configuration)
During initial application development, we can use
a few processes on the same machine as the testing platform and assign
application tasks to these processes. Later, near completion and
delivery, we can start testing in
configurations involving multiple machines, which are closer to the
real deployment configuration in the field.
For more details on configuration, please read the
next two sections.
4. CVM system composition and configuration
ACE_Service_Configurator provides a simple
configuration language (in either plain text or XML
format) to provision parameters for dynamically loaded
application modules (shared libraries or DLLs). On top of the ACE config
language, CVM configuration files use the following conventions for
definitions:
- each entity (CvmChannel | CvmTcpConnector | CvmUnixConnector |
Application_Tasks derived from CvmTask) is uniquely named.
- the "binding" relationships among entities are defined by
referring to the names of previously defined entities.
- the framework classes provide methods to find objects by name.
The following is a sample configuration file for a
hosting process containing 2 channels, 2 connectors and 2 tasks:
dynamic Channel1 Service_Object * Cvm:_make_CvmChannel()
dynamic Channel2 Service_Object * Cvm:_make_CvmChannel()
the above 2 directives create 2 channel
instances (named Channel1, Channel2) inside the hosting process.
dynamic Connector1 Service_Object * Cvm:_make_CvmUnixConnector() "-c Channel1 -u lsock_ping -r lsock_pong1"
the above directive creates a Unix
domain socket connector (named Connector1), binds it to Channel1, starts
listening for incoming connections at "lsock_ping" (i.e. other channels
can connect to Channel1 at lsock_ping) and connects Channel1 to the remote
channel at address "lsock_pong1".
dynamic Connector2 Service_Object * Cvm:_make_CvmTcpConnector() "-c Channel2 -p 52345"
the above directive creates a TCP
socket connector (named
Connector2), binds it to Channel2, and starts listening for incoming
connections at TCP port 52345 (i.e. other channels can connect to
Channel2 at port 52345).
dynamic Ping_Task Service_Object * PingTask:_make_Ping_Task() "-c Channel1"
the above directive creates an
application task (named Ping_Task) sending/receiving messages through
Channel1.
dynamic Demo_Task Service_Object * DemoTask:_make_Demo_Task() "-c Channel2"
the above directive creates an application task (named Demo_Task)
sending/receiving messages through Channel2.
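The naming and binding conventions used by these directives can be sketched in standard C++. ChannelRegistry, NamedChannel and channel_arg() are hypothetical helpers written for this illustration; the lookup methods actually offered by the CVM framework classes are not shown in this document and may look different.

```cpp
#include <map>
#include <memory>
#include <sstream>
#include <string>

// Hypothetical named entity; a real channel would carry routing state.
struct NamedChannel {
    std::string name;
};

// Hypothetical registry illustrating "uniquely named" and "find by name".
class ChannelRegistry {
public:
    // Returns false if the name is already taken.
    bool add(const std::string& name) {
        if (channels_.count(name)) return false;
        channels_[name] = std::make_shared<NamedChannel>(NamedChannel{name});
        return true;
    }
    // Returns a null pointer if no channel has this name.
    std::shared_ptr<NamedChannel> find(const std::string& name) const {
        auto it = channels_.find(name);
        if (it == channels_.end()) return nullptr;
        return it->second;
    }
private:
    std::map<std::string, std::shared_ptr<NamedChannel>> channels_;
};

// Extracts the value following "-c" from an argument string such as
// "-c Channel1 -u lsock_ping"; returns an empty string if "-c" is absent.
std::string channel_arg(const std::string& args) {
    std::istringstream in(args);
    std::string tok;
    while (in >> tok) {
        if (tok == "-c" && (in >> tok)) return tok;
    }
    return "";
}
```

A connector or task initialized with "-c Channel1" would resolve its channel with something like reg.find(channel_arg(args)), which is why a channel must be defined before the entities that bind to it.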
TBD: need to define the configuration syntax for dynamically loadable
wrappers of filters and translators and bind them to connectors
5. Dynamic system reconfiguration and Control
The ACE_Service_Configurator framework has
a service object, ACE_Service_Manager, which can be enabled in each
process and configured (in the config file) to listen at a
specific port (e.g. 9411) for incoming service management commands.
These commands can be used to control application tasks without shutting
down the process. The following commands are supported:
- "dynamic" : dynamically load and start an application task
- "remove" : remove an application task
- "suspend" : suspend a task without removing it
- "resume" : resume a suspended task
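The effect of the four commands on a hosting process can be modelled with a small state table. This is a simplified sketch, not ACE_Service_Manager: TaskTable is a hypothetical class, nothing is actually loaded, and the "dynamic" command here takes only a task name, whereas a real directive also names the library and factory function.

```cpp
#include <map>
#include <sstream>
#include <string>

enum class State { Running, Suspended };

// Hypothetical in-process table of loaded tasks, keyed by task name.
class TaskTable {
public:
    // Applies one command line such as "suspend Ping_Task".
    // Returns true if the command was valid in the current state.
    bool apply(const std::string& line) {
        std::istringstream in(line);
        std::string verb, name;
        if (!(in >> verb >> name)) return false;

        auto it = tasks_.find(name);
        if (verb == "dynamic") {
            if (it != tasks_.end()) return false;   // already loaded
            tasks_[name] = State::Running;          // "load" and start
            return true;
        }
        if (it == tasks_.end()) return false;       // unknown task
        if (verb == "remove") {
            tasks_.erase(it);
            return true;
        }
        if (verb == "suspend" && it->second == State::Running) {
            it->second = State::Suspended;
            return true;
        }
        if (verb == "resume" && it->second == State::Suspended) {
            it->second = State::Running;
            return true;
        }
        return false;
    }

    bool loaded(const std::string& name) const {
        return tasks_.count(name) != 0;
    }

private:
    std::map<std::string, State> tasks_;
};
```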
To relocate an application task from one process to
another process, there are the following issues involved:
- the first issue is how to replicate/maintain application context
data during the move.
- the moved task should save its context data before it is stopped
and removed in the original process, normally it is done inside
cleanup() method.
- the moved task should restore its context data during its startup
process in the new hosting process, normally it is done inside
prepare() method.
- the 2nd issue is how the threads and software modules related to the
moved application task are handled. The following procedure is
normally followed:
- "suspend" command is sent to the original process to suspend the
task; all of its threads will be paused and application processing stops.
- "remove" command is sent to the original process to remove the
task; all of its threads will exit and its code (shared library/DLL)
will be unloaded.
- "dynamic" command is sent to the new process to dynamically load
this task and start it.
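The relocation procedure can be sketched end to end with a toy task whose context is a single counter. ContextStore, CounterTask and relocate_demo() are hypothetical; the store is an in-memory map standing in for a file or database that both hosting processes can reach, and both "processes" are simulated inside one function.

```cpp
#include <map>
#include <string>
#include <utility>

// Hypothetical external store that outlives either hosting process
// (a file or database in practice).
using ContextStore = std::map<std::string, int>;

// Hypothetical task whose only context data is a counter.
class CounterTask {
public:
    CounterTask(std::string name, ContextStore& store)
        : name_(std::move(name)), store_(store) {}

    void prepare() {                        // restore context on startup
        auto it = store_.find(name_);
        count_ = (it == store_.end()) ? 0 : it->second;
    }
    void work() { ++count_; }               // one unit of processing
    void cleanup() { store_[name_] = count_; }   // save context on removal
    int count() const { return count_; }

private:
    std::string name_;
    ContextStore& store_;
    int count_ = 0;
};

// Simulates moving Demo_Task from one hosting process to another.
int relocate_demo() {
    ContextStore store;

    CounterTask in_old_process("Demo_Task", store);
    in_old_process.prepare();
    in_old_process.work();
    in_old_process.work();
    in_old_process.cleanup();    // done as part of "suspend" + "remove"

    CounterTask in_new_process("Demo_Task", store);
    in_new_process.prepare();    // done as part of "dynamic" in the new host
    in_new_process.work();
    return in_new_process.count();
}
```

The task resumes counting from where it stopped because the only state that crosses the move is what cleanup() wrote and prepare() read back.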