Channel - a generic communication facility
Yigong Liu (2/6/2006)
Table of Contents
- INTRODUCTION: channel - shared namespace
- PART I. Channel Dynamic Behaviour And Compositions
- 2. Channel namespace management I - pub/sub scope
- 3. Channel namespace management II - Channel connection
- 4. Channel namespace management III - interface, translator, filters
- 5. Channel is reflective/self-descriptive/self-revealing
- PART II. Channel Static/Configuration Compositions
- 6. Polymorphic/Generic Channels I: Id_Type and Id_Traits
- 7. Polymorphic/Generic Channels II: Routers
- 8. Polymorphic/Generic Channels III: Synch_Strategy, Platform_wrapper
- 9. Msg(Payload)Traits (Alloc/Free, Marshaling)
- PART III. Applications
- 10. Use cases
- 11. Application 1: single-threaded event dispatcher
- 12. Application 2: CVM (Channel Virtual Machine)
- 13. Application 3: MsgBoard - Simple msg-persistence / tuple-space style application
A channel is a communication namespace shared by its members (peers),
which are normally threads or callbacks. Peers communicate through a
channel by publishing and subscribing to messages in this shared
namespace. When a peer sends a msg to the channel, every peer
subscribed to that msg receives it. So there are 4 basic channel
communication operations:
- Channel.publish (Msg_Id);
- Channel.subscribe (Msg_Id);
- Channel.send (Msg_Id, payload_data);
- Channel.recv (Msg_Id, payload_data);
A channel's namespace consists of all the msg_ids published in
this channel. Depending on the types of MsgIds and routing algorithms,
we can have 3 types of namespaces:
- linear: such as using integers as Ids and hash for routing
- hierarchical: such as using pathnames for Ids
- associative: such as using Linda's associative lookup
A channel is a process "local" namespace shared by threads/callbacks
inside the process; the core of channel is a light-weight, simple
registration/routing table data structure indexed by msg_ids.
Channels can connect to remote channels to form "merged"
namespaces.
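The routing-table idea can be sketched in a few lines. This is only an illustration under stated assumptions, not the library's Channel class: MiniChannel, Receiver and the rule that send() reaches nobody for an unpublished id are all hypothetical, ids are plain integers, and recv is modelled as a callback.

```cpp
#include <cstddef>
#include <functional>
#include <map>
#include <set>
#include <string>
#include <utility>
#include <vector>

// Hypothetical sketch: a process-local namespace plus a routing table
// indexed by msg ids. Not thread-safe, no scopes, no remote connections.
class MiniChannel {
public:
    typedef int MsgId;
    typedef std::function<void(const std::string&)> Receiver;

    // publish: add the id to the channel's namespace.
    void publish(MsgId id) { namespace_.insert(id); }

    // subscribe: register a receiver in the routing table under the id.
    void subscribe(MsgId id, Receiver r) { table_[id].push_back(std::move(r)); }

    // send: look the id up and hand the payload to every subscriber.
    // Returns the number of receivers reached (0 if the id was never published).
    std::size_t send(MsgId id, const std::string& payload) const {
        if (namespace_.count(id) == 0) return 0;
        auto it = table_.find(id);
        if (it == table_.end()) return 0;
        for (const auto& receiver : it->second) receiver(payload);
        return it->second.size();
    }

private:
    std::set<MsgId> namespace_;
    std::map<MsgId, std::vector<Receiver> > table_;
};
```

A peer would call publish(7) once, other peers would subscribe(7, callback), and each send(7, data) then invokes every registered callback in turn.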
The design and implementation are highly modular and reconfigurable:
- separate the system into several "orthogonal" aspects, configured
either dynamically or statically
- allow these aspects to be mixed and matched freely to create
different types of channels for different applications.
Dynamically, Channels can be configured in the following aspects:
- namespace management, pub/sub scope control
- remote channel connections
- channel msgs/Ids translators and filters
- msgs memory management and marshaling
Statically, Channels can be configured in the following aspects:
- Id types and Id-traits
- Router types and routing algorithms
- Synchronization strategies
We'll discuss these aspects in detail.
TBD: redo Connectors as "operations" over channel namespaces
2. Channel namespace management I - pub/sub scope
A single channel is a process-local namespace, and it can be connected
to and merged with remote channels/namespaces. Similar to lexical
scoping in programming languages, with its local and global variables,
publish/subscribe scopes are defined in the following terms:
- local peers: communication peers inside the same channel
- remote peers: communication peers from different channels
- SCOPE_LOCAL: publish/send specified messages to local peers;
subscribe/recv specified messages from local peers
- SCOPE_REMOTE: publish/send specified messages to remote peers;
subscribe/recv specified messages from remote peers
- SCOPE_GLOBAL: publish/send specified messages to both local and
remote peers; subscribe/recv specified messages from both local and
remote peers
Scopes are specified when message/ids are published or subscribed:
- Channel.publish (Msg_Id, SCOPE_LOCAL);
- Channel.subscribe (Msg_Id, SCOPE_GLOBAL);
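One plausible reading of these definitions is that a message is delivered only when the publisher's scope covers the subscriber and the subscriber's scope covers the publisher. The sketch below encodes that reading; the helper names covers() and delivers() are hypothetical, and only the scope names come from the text.

```cpp
// Scope names as used in the text; the enum itself is an assumption.
enum PubSub_Scope { SCOPE_LOCAL, SCOPE_REMOTE, SCOPE_GLOBAL };

// True when `scope` includes a peer that is local (same channel) or remote.
inline bool covers(PubSub_Scope scope, bool peer_is_local) {
    if (scope == SCOPE_GLOBAL) return true;
    return peer_is_local ? scope == SCOPE_LOCAL : scope == SCOPE_REMOTE;
}

// A publication reaches a subscription only if each side's scope
// covers the other side's location.
inline bool delivers(PubSub_Scope pub_scope, PubSub_Scope sub_scope,
                     bool same_channel) {
    return covers(pub_scope, same_channel) && covers(sub_scope, same_channel);
}
```

Under this reading, an id published with SCOPE_LOCAL never leaves its channel, even if a remote peer subscribed to it with SCOPE_GLOBAL.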
3. Channel namespace management II - Channel connection
3.1 Export interface of a channel
The interface of a function is:
- input: input arguments
- output: output arguments and return values.
Similarly, a channel's interface to the outside world is:
- input: globally subscribed msgs/ids
- output: globally published msgs/ids
3.2 Namespace merge during channel connection
When 2 channels (A and B) are connected, their namespaces are
updated/merged in the following ways:
- msgs flowing from B->A: the intersection of A's input
msgs set (global subscriptions) and B's output msgs set (global
publications)
- msgs flowing from A->B: the intersection of B's input
msgs set (global subscriptions) and A's output msgs set (global
publications)
- newly published or subscribed msgs/ids are automatically
propagated to connected channels, so peers in channel A can communicate
with peers in channel B transparently, the same way as with local
peers.
When channels are disconnected (either intentionally or caused by
network failures), channels' namespaces will be updated automatically
(remotely published/subscribed ids will be removed).
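The merge rule is a pair of set intersections. A minimal sketch, assuming string ids and a hypothetical ChannelInterface struct that holds only the globally scoped ids of one channel:

```cpp
#include <algorithm>
#include <iterator>
#include <set>
#include <string>

typedef std::set<std::string> IdSet;

// Hypothetical view of a channel's exported interface.
struct ChannelInterface {
    IdSet global_subscriptions;  // input msgs set
    IdSet global_publications;   // output msgs set
};

// Ids that flow from `from` to `to` once the two channels are connected:
// what `from` publishes globally, intersected with what `to` subscribes
// to globally.
inline IdSet flow(const ChannelInterface& from, const ChannelInterface& to) {
    IdSet result;
    std::set_intersection(from.global_publications.begin(),
                          from.global_publications.end(),
                          to.global_subscriptions.begin(),
                          to.global_subscriptions.end(),
                          std::inserter(result, result.begin()));
    return result;
}
```

flow(B, A) gives the msgs flowing B->A and flow(A, B) those flowing A->B; on disconnect the ids contributed by the other side would simply be dropped again.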
3.3 Connectors
Channels themselves are lightweight, thread-safe, process-local data
structures, providing namespace management and routing support.
Channels have NO internal threads.
Using connectors, channels can be announced at "addresses" and
connected to remote channels at other addresses. Connectors have
internal threads to manage remote connections.
- local connector : Channels inside the same process address
space can be connected thru local connectors.
Minimal overhead - basically pointer passing, no internal
threads/queuing
Channel *chan1 = new Channel();
Channel *chan2 = new Channel();
LocalConnector::connect (chan1, chan2);
- remote connectors: Channels at different processes and machines
can be connected thru remote connectors.
Overhead - connection setup, internal threads, msg queuing,
marshaling/demarshaling
Based on the transport mechanism, there are the following types of
connectors:
<1> unix sock connector
<2> tcp sock connector
<3> shared memory connector
<4> SOAP connector
<5> Transient Connector (One new connection per message transfer)
Channel *chan = new Channel();
TcpSockConnector *conn = new TcpSockConnector(chan);
// start listening for incoming connections at the channel's address
conn->open("192.168.254.11:8888" /* chan_addr */);
// connect to a remote channel at its address
conn->connect("192.168.254.112:6666" /* remote chan addr */);
- internal interface between Channel and Connectors. How to add new
connectors.
4. Channel namespace management III - interface, translator, filters
4.1 Interface
When a channel is connected to other channels, the endpoint where the
channel and the connection bind is called an "interface": the channel's
opening to the remote end. At interfaces, we can install translators
and filters for further namespace management.
4.2 Translators
similar to
- remote filesystem mount operation
- NAT (address translation)
To support integration of namespaces, we can use namespace translators
to translate msgs/ids, so that the remote namespace content (MsgIds)
from one connection is relocated to a specific subspace of the local
namespace. Translators are applied to user/application msgs, so they
are called often and have to be very efficient:
- input translator: translate incoming msgs
- output translator: reverse-translate outgoing msgs on the same
connection
For example,
- with linear namespace, the translator can add a delta/offset to
incoming msgs ids from one connection to move the msgs/ids to a
specific section of namespace.
- with hierarchical namespace, the translator can prefix a
"mountpoint path_name" to incoming msgs/ids, so that the msgs from this
connection will be located in a subtree of the whole namespace, similar
to "mounting" a remote file system to a local mount point.
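Both examples can be sketched as a pair of inverse functions. The struct and method names below (OffsetTranslator, MountTranslator, translate_in, translate_out) are hypothetical; the actual translator interface is not shown in this document.

```cpp
#include <string>

// Linear namespace: shift ids from one connection by a fixed offset.
struct OffsetTranslator {
    int offset;
    int translate_in(int id) const { return id + offset; }   // incoming
    int translate_out(int id) const { return id - offset; }  // outgoing
};

// Hierarchical namespace: "mount" remote ids under a local path prefix.
struct MountTranslator {
    std::string mount_point;  // e.g. "/remote/b", without trailing '/'

    std::string translate_in(const std::string& id) const {
        return mount_point + id;
    }

    // Strip the mount point from ids inside the mounted subtree;
    // ids outside the subtree are returned unchanged.
    std::string translate_out(const std::string& id) const {
        const std::string::size_type n = mount_point.size();
        const bool inside = id.compare(0, n, mount_point) == 0 &&
                            (id.size() == n || id[n] == '/');
        return inside ? id.substr(n) : id;
    }
};
```

Both directions are a constant-time add or a single prefix operation, which matches the requirement that translators stay cheap on the per-message path.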
4.3 Filters
similar to
- file-sys read/write/exec permissions
- firewalls
Filters are applied to subscription/publication msgs, not to
application/user msgs. Filters are used to control:
- which msgs the remote end can subscribe and recv from local
channel
- which msgs the remote end can publish and send to the local
channel
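A filter can be as simple as two permission lists checked when the remote end's subscription/publication msgs arrive. The sketch below assumes pathname-style ids and prefix-based permissions; PrefixFilter and its members are hypothetical names.

```cpp
#include <string>
#include <vector>

// Hypothetical filter installed at one interface of a channel.
struct PrefixFilter {
    std::vector<std::string> exportable;  // remote may subscribe under these
    std::vector<std::string> importable;  // remote may publish under these

    // True if `id` starts with any of the given prefixes.
    static bool under(const std::string& id,
                      const std::vector<std::string>& prefixes) {
        for (const auto& p : prefixes)
            if (id.compare(0, p.size(), p) == 0) return true;
        return false;
    }

    // May the remote end subscribe to and recv this id from the local channel?
    bool allow_remote_subscribe(const std::string& id) const {
        return under(id, exportable);
    }

    // May the remote end publish and send this id to the local channel?
    bool allow_remote_publish(const std::string& id) const {
        return under(id, importable);
    }
};
```

Because the check runs only when ids are subscribed or published, it adds no cost to the routing of ordinary application msgs.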
5. Channel is reflective/self-descriptive/self-revealing
Members can subscribe to the following 8 system msgs to receive
notifications when the channel configuration changes:
- CHANNEL_CONN_MSG: a remote channel connects in.
- CHANNEL_DISCONN_MSG: a remote channel disconnects.
- INIT_SUBSCRIPTION_INFO_MSG: initial subscription info exchange.
- INIT_PUBLICATION_INFO_MSG: initial publication info exchange.
- SUBSCRIPTION_INFO_MSG: a subscriber subscribes to msgs.
- UNSUBSCRIPTION_INFO_MSG: a subscriber unsubscribes msgs.
- PUBLICATION_INFO_MSG: a publisher publishes msgs.
- UNPUBLICATION_INFO_MSG: a publisher unpublishes msgs.
Members can subscribe to these msgs and react appropriately. For
example:
- learn when remote channels connect in or remote peers join, and
start communicating
- learn about channel disconnects (network failures?) and handle
them gracefully, as normal cases (normal messages) rather than as
exceptions.
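Treating a disconnect as a normal message means a peer handles it with the same subscribe/callback path as any other id. A minimal sketch, assuming integer ids with hypothetical negative values for the system msgs and a stand-in NotifyingChannel class (only the msg names come from the list above):

```cpp
#include <functional>
#include <map>
#include <set>
#include <string>
#include <utility>

// Hypothetical id values; the text defines only the names.
enum SystemMsgId { CHANNEL_CONN_MSG = -1, CHANNEL_DISCONN_MSG = -2 };

// Stand-in channel: routes system msgs exactly like application msgs.
class NotifyingChannel {
public:
    typedef std::function<void(const std::string&)> Handler;
    void subscribe(int id, Handler h) {
        handlers_.insert(std::make_pair(id, std::move(h)));
    }
    void send(int id, const std::string& remote_addr) const {
        auto range = handlers_.equal_range(id);
        for (auto it = range.first; it != range.second; ++it)
            it->second(remote_addr);
    }
private:
    std::multimap<int, Handler> handlers_;
};

// A peer that keeps a live view of which remote channels are connected.
class PeerTracker {
public:
    explicit PeerTracker(NotifyingChannel& ch) {
        ch.subscribe(CHANNEL_CONN_MSG,
                     [this](const std::string& a) { remotes_.insert(a); });
        ch.subscribe(CHANNEL_DISCONN_MSG,
                     [this](const std::string& a) { remotes_.erase(a); });
    }
    bool connected(const std::string& addr) const {
        return remotes_.count(addr) != 0;
    }
private:
    std::set<std::string> remotes_;
};
```

In the real system a connector would presumably be the one sending these msgs; here send() is called by hand to stand in for it.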
PART II. Channel Static/Configuration Compositions
Channel static compositions are supported with C++ templates. Each of
the following "aspects" is a parameter class of the Channel template:
template
<
  class Id_Type,
  class Id_Trait = IdTrait<Id_Type>,
  class SynchPolicy = ACE_MT_SYNCH,
  class AllocPolicy,
  class Router = MapRouter<Id_Type, Id_Trait, SynchPolicy, AllocPolicy>
>
class Channel;
All these "parameter" classes implement well-known interfaces
required by the framework. By applying different classes to this
template, we can instantiate different channel types for various
application contexts.
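The mix-and-match idea can be tried out without ACE. The sketch below is a reduced stand-in, not the Channel template above: SingleThreaded, MultiThreaded, MapTable and PolicyChannel are hypothetical names, std::mutex replaces the ACE synch policy, and the Id_Trait and AllocPolicy parameters are left out.

```cpp
#include <cstddef>
#include <map>
#include <mutex>
#include <string>
#include <vector>

// Synchronization policies: each names the mutex type a channel should use.
struct NullMutex { void lock() {} void unlock() {} };
struct SingleThreaded { typedef NullMutex Mutex; };
struct MultiThreaded { typedef std::mutex Mutex; };

// Stand-in router: an ordered map from id to subscriber names.
template <class IdType>
struct MapTable {
    std::map<IdType, std::vector<std::string> > subscribers;
};

// Each "aspect" is a template parameter with a default.
template <class IdType,
          class SynchPolicy = MultiThreaded,
          class Router = MapTable<IdType> >
class PolicyChannel {
public:
    void subscribe(const IdType& id, const std::string& who) {
        std::lock_guard<typename SynchPolicy::Mutex> guard(mutex_);
        router_.subscribers[id].push_back(who);
    }
    std::size_t subscriber_count(const IdType& id) {
        std::lock_guard<typename SynchPolicy::Mutex> guard(mutex_);
        auto it = router_.subscribers.find(id);
        return it == router_.subscribers.end() ? 0 : it->second.size();
    }
private:
    typename SynchPolicy::Mutex mutex_;
    Router router_;
};
```

PolicyChannel<int, SingleThreaded> gives a lock-free integer-id channel for an event loop, while PolicyChannel<std::string> gives a mutex-protected string-id channel, from the same source.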
6. Polymorphic/Generic Channels I: Id_Type and Id_Traits
Msg Ids and Msg types are synonyms for the "key" information which
channels use to route msgs, and which members/peers use to subscribe
to or publish msgs. Channels know the content of msg Ids but not of
the msg payload, which is opaque. The msg payload can be NULL; then
everything that passes through a channel is an Id, and routing becomes
content-based.
Different applications and different routing algorithms require
different Id types:
- embedded systems normally use integer Msg Ids for minimal space and
computation overhead.
- e-commerce applications could use strings as msg Ids for easy
display/debugging
- a hierarchical namespace supporting group routing could use
unix-pathname style Ids to represent tree structures (D-BUS)
- Linda style associative routing may use multi-field structures as
Ids
Ids could be a single contiguous memory region or structs with
pointers to non-contiguous memory space.
To support routing computations, Id types need to provide several
operations and attributes, which are wrapped in a trait class -
Id_Trait:
- key operator functions for routing calc: <, ==, !=
- idToString(): return a string representation for debugging
- size(): size of memory space taken by Id
- marshal/demarshal functions
- memory management functions: alloc, free
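A trait class of this shape might look as follows for integer and string ids. This is a sketch under assumptions: IdTraitSketch and the member names less/equal are hypothetical (only idToString() and size() are named in the text), and the marshal/demarshal and alloc/free members are omitted.

```cpp
#include <cstddef>
#include <string>

// Primary template intentionally left undefined: every Id type must
// supply its own specialization.
template <class IdType> struct IdTraitSketch;

template <> struct IdTraitSketch<int> {
    static bool less(int a, int b) { return a < b; }
    static bool equal(int a, int b) { return a == b; }
    static std::string idToString(int id) { return std::to_string(id); }
    static std::size_t size(int) { return sizeof(int); }
};

template <> struct IdTraitSketch<std::string> {
    static bool less(const std::string& a, const std::string& b) { return a < b; }
    static bool equal(const std::string& a, const std::string& b) { return a == b; }
    static std::string idToString(const std::string& id) { return id; }
    // Number of characters the id occupies, excluding any terminator.
    static std::size_t size(const std::string& id) { return id.size(); }
};
```

A router written against the trait (for example, calling IdTraitSketch<Id>::less for ordering) then works unchanged for any Id type that provides a specialization.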
Also, the Id_Traits class contains the definitions of system msg
Ids/types:
1> channel management msgs:
- CHANNEL_CONN_MSG
- CHANNEL_DISCONN_MSG
- INIT_SUBSCRIPTION_INFO_MSG
- INIT_PUBLICATION_INFO_MSG
- SUBSCRIPTION_INFO_MSG
- UNSUBSCRIPTION_INFO_MSG
- PUBLICATION_INFO_MSG
- UNPUBLICATION_INFO_MSG
2> wildcard msgs:
- WILDCARD_MSG
7. Polymorphic/Generic Channels II: Routers
Depending on the namespace organization (linear, hierarchical), msg
routing requirements (group broadcast support, associative lookup,
broadcast...) and performance/resource requirements (embedded systems,
commerce applications...), many routing algorithms have been developed.
They can be classified into the following categories:
1> hash (table/stl map) routers:
- mostly used in embedded systems, traditional messaging systems
- namespaces are linear/flat (simple)
- normally use simple Id_Type: integer, string
- use a hash-table/map as the routing data structure and exact
id-matching as the routing algorithm
2> hierarchical namespace routers:
- used in applications needing complicated hierarchical namespaces
- sample: D-BUS in linux
- could use unix filesys_pathname style Id_types to represent tree
structures
- data structures?
- support group broadcasting: using partial pathname (prefix)
matching for routing
3> associative routers:
- used in Linda style tuplespace
- normally use complicated Id_types containing multiple fields; could
use the msg itself as Id.
- associative lookup: pattern matching of Id fields for routing.
4> Broadcast|RoundRobin
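The matching rules behind categories 2> and 3> can be sketched as two small predicates. These are illustrations only: path_matches, tuple_matches and the convention that an empty field acts as a wildcard are assumptions, not the routers' actual algorithms.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hierarchical routing: a subscription to a path matches that node and
// every id in the subtree below it (partial pathname / prefix matching).
inline bool path_matches(const std::string& subscribed, const std::string& id) {
    if (subscribed.empty()) return true;  // empty prefix matches everything
    const std::string::size_type n = subscribed.size();
    if (id.compare(0, n, subscribed) != 0) return false;
    if (id.size() == n) return true;      // exact match
    // Only match on a path-component boundary.
    return subscribed[n - 1] == '/' || id[n] == '/';
}

// Associative routing: ids are multi-field tuples; a pattern matches when
// every non-empty pattern field equals the corresponding id field.
typedef std::vector<std::string> Tuple;

inline bool tuple_matches(const Tuple& pattern, const Tuple& id) {
    if (pattern.size() != id.size()) return false;
    for (std::size_t i = 0; i < pattern.size(); ++i)
        if (!pattern[i].empty() && pattern[i] != id[i]) return false;
    return true;
}
```

A hash/map router needs neither predicate, since it matches ids exactly; that is why it is the cheapest of the categories.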
As the core of channel, Routers have to support the following key
methods for namespace management and msg routing:
publish_msg(IdType t, PubSub_Scope s, Source *src)
unpublish_msg(IdType t, Source *src)
subscribe_msg(IdType t, Destination *s)
unsubscribe_msg(IdType t, Destination *s)
route_msg(Msg *msg, Member_Type src_type, PubSub_Scope scope, ACE_Time_Value *timeout=0)
Opaque payload: need to add "wildcard" Ids for routing or for
registrations of memory manager or marshaler (wildcard - the default?).
10. Use cases
- single-threaded event dispatcher
- multi-threaded message passing system
- distributed message passing system
- with payload=NULL, content-based routing
11. Application 1: single-threaded event dispatcher
1> an event dispatching system for GUI:
GUI primitives: line/circle/rect/...
events_types: Button_Down_On/Button_Up_On/Selected/Moved
callbacks: to be called when events happen on chosen primitives
2> a channel for GUI events:
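enum event_type { Button_Down_On, Button_Up_On, Selected, Moved };
enum primitive_type { line, circle, rect /* , ... */ };
struct GUI_primitive; // GUI primitive class, defined elsewhere
typedef struct {
  event_type type;
  primitive_type prim_type;
  GUI_primitive *prim; // or char *prim_name; for remote msgs?
} GUI_Channel_Id;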
each GUI primitive will publish its own event types.
event_handlers/event_listeners/callbacks will register for specific
events on specific primitives, for specific events on specific
primitive types, or for specific events on all primitives.
these callbacks will be invoked only when these events happen on these
primitives.
3> completely independent from the existing class hierarchy and object
containment hierarchy; can be added non-intrusively
4> similar to Java's event handling framework, supports dynamic event
bindings;
advantages:
- event dispatching framework independent from object hierarchies
- supports asynchronous event handling and remote handling directly
and transparently.
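A single-threaded dispatcher along these lines can be sketched with plain callbacks. Everything here is hypothetical (GuiDispatcher, GuiPrimitive, on(), dispatch()); it covers registration on one specific primitive or on all primitives, and leaves out registration by primitive type and any remote delivery.

```cpp
#include <functional>
#include <string>
#include <utility>
#include <vector>

enum EventType { Button_Down_On, Button_Up_On, Selected, Moved };
enum PrimitiveType { Line, Circle, Rect };

struct GuiPrimitive {
    PrimitiveType kind;
    std::string name;
};

class GuiDispatcher {
public:
    typedef std::function<void(const GuiPrimitive&)> Callback;

    // Register a callback for an event on one primitive;
    // target == nullptr means "this event on any primitive".
    void on(EventType ev, const GuiPrimitive* target, Callback cb) {
        bindings_.push_back(Binding{ev, target, std::move(cb)});
    }

    // Synchronously invoke every matching callback in the caller's thread.
    // Returns how many callbacks fired.
    int dispatch(EventType ev, const GuiPrimitive& prim) const {
        int fired = 0;
        for (const auto& b : bindings_) {
            if (b.ev == ev && (b.target == nullptr || b.target == &prim)) {
                b.cb(prim);
                ++fired;
            }
        }
        return fired;
    }

private:
    struct Binding {
        EventType ev;
        const GuiPrimitive* target;
        Callback cb;
    };
    std::vector<Binding> bindings_;
};
```

The primitives need no base class or listener list of their own: bindings live entirely in the dispatcher, which is what keeps the scheme independent of the object hierarchy.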
12. Application 2: CVM (Channel Virtual Machine)
a small dynamically reconfigurable framework based on ACE
service_config, task frameworks and channels
13. Application 3: MsgBoard
Simple msg-persistence / tuple-space style application, using
Berkeley-DB as storage