Channel - a generic communication facility

Yigong Liu (2/6/2006)

1. Introduction: channel - shared namespace

PART I. Channel Dynamic Behaviour And Compositions

2. Channel namespace management I - pub/sub scope
3. Channel namespace management II - Channel connection
4. Channel namespace management III - interface, translator, filters
5. Channel is reflective/self-descriptive/self-revealing

PART II. Channel Static/Configuration Compositions

6. Polymorphic/Generic Channels I: Id_Type and Id_Traits
7. Polymorphic/Generic Channels II: Routers
8. Polymorphic/Generic Channels III: Synch_Strategy, Platform_wrapper
9. Msg(Payload)Traits (Alloc/Free, Marshaling)

PART III. Applications

10. Use cases
11. Application 1: single-threaded event dispatcher
12. Application 2: CVM (Channel Virtual Machine)
13. Application 3: MsgBoard - Simple msg-persistence / tuple-space style application

1. Introduction: channel - shared namespace

A channel is a communication namespace shared by its members or peers, which are normally threads or callbacks. Peers communicate through a channel by publishing and subscribing to messages in this shared namespace. When a peer sends a msg to the channel, every peer subscribed to that msg receives it. So there are 4 basic channel communication operations:

Channel.publish (Msg_Id);
Channel.subscribe (Msg_Id);
Channel.send (Msg_Id, payload_data);
Channel.recv (Msg_Id, payload_data);
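
As an illustration only, the four operations can be sketched with a tiny in-process channel. This assumes integer ids, integer payloads and callback peers; the class MiniChannel and its signatures are hypothetical and are not the library's API. Here recv is modelled by the callback being invoked.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <map>
#include <set>
#include <vector>

// Hypothetical in-process channel: integer ids, int payloads, callback peers.
class MiniChannel {
public:
    typedef std::function<void(int id, int payload)> Callback;

    // publish: add an id to the channel's namespace.
    void publish(int id) { published_.insert(id); }

    // subscribe: register interest in an id; the callback plays the role of recv.
    void subscribe(int id, Callback cb) { subscribers_[id].push_back(cb); }

    // send: route the payload to every subscriber of a published id.
    // Returns the number of peers that received the message.
    std::size_t send(int id, int payload) const {
        if (published_.count(id) == 0) return 0;  // id is not in the namespace
        std::map<int, std::vector<Callback> >::const_iterator it = subscribers_.find(id);
        if (it == subscribers_.end()) return 0;   // nobody subscribed
        for (std::size_t i = 0; i < it->second.size(); ++i) {
            it->second[i](id, payload);
        }
        return it->second.size();
    }

private:
    std::set<int> published_;
    std::map<int, std::vector<Callback> > subscribers_;
};

// Usage: two peers subscribe to id 7; one send reaches both.
inline int mini_channel_demo() {
    MiniChannel chan;
    int sum = 0;
    chan.publish(7);
    chan.subscribe(7, [&sum](int, int payload) { sum += payload; });
    chan.subscribe(7, [&sum](int, int payload) { sum += payload; });
    std::size_t delivered = chan.send(7, 21);
    assert(delivered == 2);
    (void)delivered;
    return sum;  // both callbacks added 21
}
```

The sender never names its receivers; it only names the id, which is what makes the namespace the unit of sharing.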

A channel's namespace consists of all the msg_ids published in this channel. Depending on the types of MsgIds and routing algorithms, we can have 3 types of namespaces:

linear: such as using integers as Ids and hash for routing
hierarchical: such as using pathnames for Ids
associative: such as using Linda's associative lookup
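
The difference between the three namespace types is mostly a difference in how a subscribed id is matched against a sent id. A rough sketch, assuming integer ids for the linear case, slash-separated pathnames without a trailing slash for the hierarchical case, and string tuples with an empty field as wildcard for the associative case (all function names are hypothetical):

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Linear namespace: ids are plain integers and match only when equal.
inline bool match_linear(int subscribed, int sent) {
    return subscribed == sent;
}

// Hierarchical namespace: a subscription to a path covers its whole subtree.
// "/gui" matches "/gui" and "/gui/button" but not "/guitar".
inline bool match_hierarchical(const std::string& subscribed, const std::string& sent) {
    if (sent.compare(0, subscribed.size(), subscribed) != 0) return false;
    return sent.size() == subscribed.size() || sent[subscribed.size()] == '/';
}

// Associative namespace (Linda style): ids are tuples of fields, and a
// subscription is a template in which an empty field matches anything.
inline bool match_associative(const std::vector<std::string>& pattern,
                              const std::vector<std::string>& sent) {
    if (pattern.size() != sent.size()) return false;
    for (std::size_t i = 0; i < pattern.size(); ++i) {
        if (!pattern[i].empty() && pattern[i] != sent[i]) return false;
    }
    return true;
}

// Usage of the three matchers.
inline void namespace_matching_demo() {
    assert(match_linear(42, 42));
    assert(match_hierarchical("/gui", "/gui/button"));
    assert(!match_hierarchical("/gui", "/guitar"));
    std::vector<std::string> pattern(2), sent(2);
    pattern[0] = "temp";  // second field left empty: wildcard
    sent[0] = "temp";
    sent[1] = "kitchen";
    assert(match_associative(pattern, sent));
}
```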

A channel is a process "local" namespace shared by threads/callbacks inside the process; the core of channel is a light-weight, simple registration/routing table data structure indexed by msg_ids.

Channels can connect to remote channels to form "merged" namespaces.

The design and implementation are highly modular and reconfigurable:

separate the system into several "orthogonal" aspects, either dynamically or statically
allow these aspects to be mixed and matched freely to create different types of channels for different applications.

Dynamically, Channels can be configured in the following aspects:

namespace management, pub/sub scope control
remote channel connections
channel msgs/Ids translators and filters
msgs memory management and marshaling

Statically, Channels can be configured in the following aspects:

Id types and Id-traits
Router types and routing algorithms
Synchronization strategies

We'll discuss these aspects in detail.

PART I. Channel Dynamic Behaviour And Compositions

TBD: redo Connectors as "operations" over channel namespaces

2. Channel namespace management I - pub/sub scope:

A single channel is a process-local namespace, but it can be connected to and merged with remote channels/namespaces. By analogy with lexical scoping in programming languages (local versus global variables), publish/subscribe scopes are defined as follows:

local peers: communication peers inside the same channel
remote peers: communication peers from different channels
SCOPE_LOCAL: publish/send specified messages to local peers; subscribe/recv specified messages from local peers
SCOPE_REMOTE: publish/send specified messages to remote peers; subscribe/recv specified messages from remote peers
SCOPE_GLOBAL: publish/send specified messages to both local and remote peers; subscribe/recv specified messages from both local and remote peers

Scopes are specified when message/ids are published or subscribed:

Channel.publish (Msg_Id, SCOPE_LOCAL);
Channel.subscribe (Msg_Id, SCOPE_GLOBAL);
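
One way to read these definitions is as a delivery rule: a msg flows between two peers only if both the publisher's scope and the subscriber's scope cover the relationship between them. The sketch below encodes that reading; the helper names and the rule itself are an interpretation of the definitions above, not code from the library.

```cpp
#include <cassert>

enum PubSub_Scope { SCOPE_LOCAL, SCOPE_REMOTE, SCOPE_GLOBAL };

// True when scope s covers peers inside the same channel.
inline bool covers_local(PubSub_Scope s) { return s == SCOPE_LOCAL || s == SCOPE_GLOBAL; }

// True when scope s covers peers in other, connected channels.
inline bool covers_remote(PubSub_Scope s) { return s == SCOPE_REMOTE || s == SCOPE_GLOBAL; }

// Decide whether a msg may flow from a publisher to a subscriber.
// same_channel says whether both peers live in the same channel.
inline bool may_deliver(PubSub_Scope pub, PubSub_Scope sub, bool same_channel) {
    return same_channel ? (covers_local(pub) && covers_local(sub))
                        : (covers_remote(pub) && covers_remote(sub));
}

// Usage: the publish/subscribe pair shown above (SCOPE_LOCAL publisher,
// SCOPE_GLOBAL subscriber) communicates inside one channel but not across channels.
inline void scope_demo() {
    assert(may_deliver(SCOPE_LOCAL, SCOPE_GLOBAL, true));
    assert(!may_deliver(SCOPE_LOCAL, SCOPE_GLOBAL, false));
}
```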

3. Channel namespace management II - Channel connection:

3.1 export interface of a channel.

The interface of a function is:

input: input arguments
output: output arguments and return values

Similarly, a channel's interface to outside world is:

input: globally subscribed msgs/ids
output: globally published msgs/ids

3.2 namespace merge during channel connection

When 2 channels (A and B) are connected, their namespaces are updated/merged in the following ways:

msgs flowing from B->A: the intersection of A's input msgs set (global subscriptions) and B's output msgs set (global publications)
msgs flowing from A->B: the intersection of B's input msgs set (global subscriptions) and A's output msgs set (global publications)
newly published or subscribed msgs/ids are automatically propagated to connected channels, so peers in channel A can communicate with peers in channel B transparently, in the same way as with local peers.

When channels are disconnected (either intentionally or caused by network failures), channels' namespaces will be updated automatically (remotely published/subscribed ids will be removed).
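
The merge rule in 3.2 is a pair of set intersections. A minimal sketch, assuming integer ids and a hypothetical ChannelInterface struct that holds only the global subscriptions and publications of a channel:

```cpp
#include <algorithm>
#include <cassert>
#include <iterator>
#include <set>

typedef std::set<int> IdSet;

// Hypothetical view of a channel's exported interface.
struct ChannelInterface {
    IdSet global_subscriptions;  // input: ids subscribed with SCOPE_GLOBAL
    IdSet global_publications;   // output: ids published with SCOPE_GLOBAL
};

// Ids that flow from `from` to `to`: to's inputs intersected with from's outputs.
inline IdSet flow(const ChannelInterface& from, const ChannelInterface& to) {
    IdSet result;
    std::set_intersection(to.global_subscriptions.begin(), to.global_subscriptions.end(),
                          from.global_publications.begin(), from.global_publications.end(),
                          std::inserter(result, result.begin()));
    return result;
}

// Usage: A subscribes to {1,2,3} and publishes {10};
// B subscribes to {10,11} and publishes {2,3,4}.
inline void merge_demo() {
    ChannelInterface a, b;
    a.global_subscriptions = {1, 2, 3};
    a.global_publications = {10};
    b.global_subscriptions = {10, 11};
    b.global_publications = {2, 3, 4};
    IdSet b_to_a = flow(b, a);  // ids 2 and 3
    IdSet a_to_b = flow(a, b);  // id 10
    assert(b_to_a.size() == 2 && a_to_b.size() == 1);
}
```

On disconnect, the same computation tells each side which remote ids to remove from its namespace.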

3.3 Connectors

Channels themselves are lightweight, thread-safe, process-local data structures providing namespace management and routing support. Channels have NO internal threads.
Using connectors, channels can be announced at "addresses" and connected to remote channels at other addresses. Connectors have internal threads to manage remote connections.

local connector : Channels inside the same process address space can be connected thru local connectors.

Channel *chan1 = new Channel();
Channel *chan2 = new Channel();
LocalConnector::connect (chan1, chan2);

remote connectors: Channels at different processes and machines can be connected thru remote connectors.

Based on the transport mechanism, there are the following types of connectors:
<1> unix sock connector
<2> tcp sock connector
<3> shared memory connector
<4> SOAP connector
<5> Transient Connector (one new connection per message transfer)

Channel *chan = new Channel();
TcpSockConnector *conn = new TcpSockConnector(chan);
conn->open("192.168.254.11:8888" /* chan_addr */); // start listening for incoming connections at chan_addr
conn->connect("192.168.254.112:6666" /* remote chan addr */); // connect to the remote channel

TBD: internal interface between Channel and Connectors; how to add new connectors.

4. Channel namespace management III - interface, translator, filters

4.1 Interface

When a channel is connected to other channels, the endpoint of each connection, where channel and connection bind, is called an "interface": the channel's opening to the remote end. At interfaces, we can install translators and filters for further namespace management.

4.2 Translators

similar to

remote filesystem mount operation
NAT (address translation)

To support integration of namespaces, we can use namespace translators to translate msgs/ids, so that the remote namespace content (MsgIds) arriving over one connection is relocated to a specific subspace of the local namespace. Translators are applied to user/application msgs, so they are called frequently and have to be very efficient:

input translator: translate incoming msgs
output translator: reverse translate outgoing msgs in the same connection

For example,

with linear namespace, the translator can add a delta/offset to incoming msg ids from one connection to move them to a specific section of the namespace.
with hierarchical namespace, the translator can prefix a "mountpoint path_name" to incoming msgs/ids, so that the msgs from this connection will be located in a subtree of the whole namespace, similar to "mounting" a remote file system to a local mount point.
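
Both examples can be sketched as a pair of inverse functions per connection. The struct names and the input/output method names below are hypothetical; the sketch assumes integer ids for the linear case and pathname strings with a mount point that has no trailing slash for the hierarchical case.

```cpp
#include <cassert>
#include <string>

// Linear namespace: shift ids from one connection by a fixed offset.
struct OffsetTranslator {
    int offset;
    int input(int remote_id) const { return remote_id + offset; }  // incoming msgs
    int output(int local_id) const { return local_id - offset; }   // outgoing msgs
};

// Hierarchical namespace: mount the remote ids under a local path prefix.
struct MountTranslator {
    std::string mount_point;  // e.g. "/remote/siteB"
    std::string input(const std::string& remote_id) const {
        return mount_point + remote_id;
    }
    // Strip the mount point again; ids that do not start with it pass unchanged.
    std::string output(const std::string& local_id) const {
        if (local_id.compare(0, mount_point.size(), mount_point) == 0) {
            return local_id.substr(mount_point.size());
        }
        return local_id;
    }
};

// Usage: output() undoes input() on the same connection.
inline void translator_demo() {
    OffsetTranslator shift = {1000};
    assert(shift.output(shift.input(5)) == 5);
    MountTranslator mount = {"/remote/siteB"};
    assert(mount.input("/gui/button") == "/remote/siteB/gui/button");
    assert(mount.output(mount.input("/gui/button")) == "/gui/button");
}
```

Both translators do constant or near-constant work per msg, which matters because they sit on the data path.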

4.3 Filters

similar to

file-sys read/write/exec permissions
firewalls

Filters are applied to subscription/publication msgs, not to application/user msgs. Filters are used to control:

which msgs the remote end can subscribe and recv from local channel
which msgs the remote end can publish and send to the local channel
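
A filter can be pictured as two allow-lists checked when the remote end announces a subscription or a publication. The class below is a hypothetical sketch with integer ids; the deny-by-default policy is an assumption made for the example, not something stated above.

```cpp
#include <cassert>
#include <set>

// Hypothetical filter installed at one interface (one remote connection).
// It screens the remote end's subscribe/publish announcements by id.
class InterfaceFilter {
public:
    void allow_subscribe(int id) { subscribable_.insert(id); }
    void allow_publish(int id) { publishable_.insert(id); }

    // May the remote end subscribe to (and so receive) this local id?
    bool accept_remote_subscription(int id) const { return subscribable_.count(id) != 0; }

    // May the remote end publish (and so send) this id into the local channel?
    bool accept_remote_publication(int id) const { return publishable_.count(id) != 0; }

private:
    std::set<int> subscribable_;
    std::set<int> publishable_;
};

// Usage: the remote end may read id 1 and write id 2, nothing else.
inline void filter_demo() {
    InterfaceFilter filter;
    filter.allow_subscribe(1);
    filter.allow_publish(2);
    assert(filter.accept_remote_subscription(1));
    assert(!filter.accept_remote_publication(1));
}
```

Because the check happens on subscription/publication msgs, it costs nothing per application msg once the namespace has been merged.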

5. Channel is reflective/self-descriptive/self-revealing:

Members can subscribe to the following 8 system msgs to receive notifications when the channel configuration changes:

CHANNEL_CONN_MSG: a remote channel connects in.
CHANNEL_DISCONN_MSG: a remote channel disconnects.
INIT_SUBSCRIPTION_INFO_MSG: initial subscription info exchange.
INIT_PUBLICATION_INFO_MSG: initial publication info exchange.
SUBSCRIPTION_INFO_MSG: a subscriber subscribes to msgs.
UNSUBSCRIPTION_INFO_MSG: a subscriber unsubscribes msgs.
PUBLICATION_INFO_MSG: a publisher publishes msgs.
UNPUBLICATION_INFO_MSG: a publisher unpublishes msgs.

Members can subscribe to these msgs and react appropriately. For example:

learn when remote channels connect in or remote peers join, and start communication
learn about channel disconnects (network failures?) and handle them gracefully, as normal cases (normal messages) rather than as exceptions.

PART II. Channel Static/Configuration Compositions

Channel static compositions are supported with C++ templates. Each of the following "aspects" is a parameter class of the Channel template:

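template
<
    class Id_Type,
    class Id_Trait = IdTrait<Id_Type>,
    class SynchPolicy = ACE_MT_SYNCH,
    class AllocPolicy = /* default allocation policy; C++ requires a default here because the preceding parameters have one */,
    class Router = MapRouter<Id_Type, Id_Trait, SynchPolicy, AllocPolicy>
>
class Channel;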

All these "parameter" classes implement well-known interfaces required by the framework. By applying different classes to this template, we can instantiate different channel types for various application contexts.

6. Polymorphic/Generic Channels I: Id_Type and Id_Traits

Msg Ids and Msg types are synonyms for the "key" information which channels use to route msgs, and which members/peers use to subscribe to or publish msgs.
Channels know the content of msg Ids but not of the msg payload, which is opaque. The msg payload can be NULL; then everything that passes through channels is an Id, and routing becomes content-based.
Different applications and different routing algorithms require different Id types:

embedded systems normally use integer Msg Ids for minimal space and computation overhead.
e-commerce applications could use strings as msg Ids for easy display/debugging
hierarchical namespace supporting group routing could use a unix-pathname style Ids to represent tree structures (D-BUS)
Linda style associative routing may use multi-field structure as Ids

Ids could be a contiguous memory region or structs with pointers to non-contiguous memory.
To support routing computations, Id types need to provide several operations and attributes, which are wrapped in a trait class - Id_Trait:

key operator functions for routing calc: <, ==, !=
idToString(): return a string representation for debugging
size(): size of memory space taken by Id
marshal/demarshal functions
memory management functions: alloc, free
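
A trait class of this kind could look roughly as follows for integer and string ids. SketchIdTrait and its member names (other than idToString and size, which are listed above) are hypothetical, and the marshaling and alloc/free functions are left out; the real Id_Trait interface may differ.

```cpp
#include <cassert>
#include <cstddef>
#include <sstream>
#include <string>

// Hypothetical trait template; only specializations are defined.
template <class Id_Type> struct SketchIdTrait;

template <> struct SketchIdTrait<int> {
    static bool less(int a, int b) { return a < b; }
    static bool equal(int a, int b) { return a == b; }
    static std::string idToString(int id) {
        std::ostringstream out;
        out << id;
        return out.str();
    }
    static std::size_t size(int) { return sizeof(int); }
};

template <> struct SketchIdTrait<std::string> {
    static bool less(const std::string& a, const std::string& b) { return a < b; }
    static bool equal(const std::string& a, const std::string& b) { return a == b; }
    static std::string idToString(const std::string& id) { return id; }
    // Bytes taken by the characters plus a terminating NUL.
    static std::size_t size(const std::string& id) { return id.size() + 1; }
};

// Generic code, such as a router, can now work with either id type.
template <class Id_Type>
bool same_id(const Id_Type& a, const Id_Type& b) {
    return SketchIdTrait<Id_Type>::equal(a, b);
}

// Usage with both id types.
inline void id_trait_demo() {
    assert(same_id(3, 3));
    assert(!same_id(std::string("a"), std::string("b")));
}
```

Keeping these operations in a trait rather than in the id type itself lets plain built-in types such as int serve as ids without a wrapper.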

The Id_Traits class also contains the definitions of system msg Ids/types:
1> channel management msgs:

CHANNEL_CONN_MSG:
CHANNEL_DISCONN_MSG:
INIT_SUBSCRIPTION_INFO_MSG:
INIT_PUBLICATION_INFO_MSG:
SUBSCRIPTION_INFO_MSG:
UNSUBSCRIPTION_INFO_MSG:
PUBLICATION_INFO_MSG:
UNPUBLICATION_INFO_MSG:

2> wildcard msgs:

WILDCARD_MSG.

7. Polymorphic/Generic Channels II: Routers

Depending on the namespace organization (linear, hierarchical), msg routing requirements (group broadcast, associative lookup, broadcast...), and performance/resource requirements (embedded systems, commerce applications...), many routing algorithms have been developed. They can be classified into the following categories:
1> hash (table/STL map) routers:

mostly used in embedded systems and traditional messaging systems
namespace is linear/flat (simple)
normally use a simple Id_Type: integer, string
use a hash table/map as the routing data structure and exact id matching as the routing algorithm

2> hierarchical namespace routers:

used in applications needing complicated hierarchical namespace
sample: D-BUS in linux
could use unix filesys_pathname style Id_types to represent tree-structure
data structures?
support group broadcasting: using partial pathname (prefix) matching for routing

3> associative routers:

used in Linda style tuplespace
normally use a complicated Id_type containing multiple fields; could use the msg itself as the Id.
associative lookup: pattern matching of Id fields for routing.

4> Broadcast|RoundRobin

As the core of a channel, Routers have to support the following key methods for namespace management and msg routing:

publish_msg(IdType t, PubSub_Scope s, Source *src)
unpublish_msg(IdType t, Source *src)
subscribe_msg(IdType t, Destination *s)
unsubscribe_msg(IdType t, Destination *s)
route_msg(Msg *msg, Member_Type src_type, PubSub_Scope scope, ACE_Time_Value *timeout=0)
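
To show how a router can be swapped at compile time, here is a reduced sketch with two routers that share a small interface and a channel template that takes the router as a parameter. Only subscribe_msg and route_msg are modelled, with simplified signatures; ExactRouter, PrefixRouter, SketchChannel and the counting Destination are hypothetical names, not the library's classes.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical destination: just counts the msgs routed to it.
struct Destination {
    int received;
    Destination() : received(0) {}
};

// Exact-match router over a std::map, in the spirit of the hash/map routers.
template <class IdType>
class ExactRouter {
public:
    void subscribe_msg(const IdType& id, Destination* d) { table_[id].push_back(d); }
    // Deliver to every destination registered under exactly this id.
    std::size_t route_msg(const IdType& id) {
        typename std::map<IdType, std::vector<Destination*> >::iterator it = table_.find(id);
        if (it == table_.end()) return 0;
        for (std::size_t i = 0; i < it->second.size(); ++i) ++it->second[i]->received;
        return it->second.size();
    }
private:
    std::map<IdType, std::vector<Destination*> > table_;
};

// Prefix-match router for pathname ids: a subscription to "/a" also
// receives "/a/b" (group broadcast). Linear scan, for brevity only.
class PrefixRouter {
public:
    void subscribe_msg(const std::string& prefix, Destination* d) {
        entries_.push_back(std::make_pair(prefix, d));
    }
    std::size_t route_msg(const std::string& id) {
        std::size_t n = 0;
        for (std::size_t i = 0; i < entries_.size(); ++i) {
            const std::string& p = entries_[i].first;
            bool is_prefix = id.compare(0, p.size(), p) == 0 &&
                             (id.size() == p.size() || id[p.size()] == '/');
            if (is_prefix) { ++entries_[i].second->received; ++n; }
        }
        return n;
    }
private:
    std::vector<std::pair<std::string, Destination*> > entries_;
};

// The router is a template parameter, so the channel type is fixed at compile time.
template <class IdType, class Router>
class SketchChannel {
public:
    void subscribe(const IdType& id, Destination* d) { router_.subscribe_msg(id, d); }
    std::size_t send(const IdType& id) { return router_.route_msg(id); }
private:
    Router router_;
};
```

A flat channel would be SketchChannel<int, ExactRouter<int> > and a tree-shaped one SketchChannel<std::string, PrefixRouter>; the peers' code is the same in both cases.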

8. Polymorphic/Generic Channels III: Synch_Strategy, Platform_wrapper

9. Msg(Payload)Traits (Alloc/Free, Marshaling)

opaque payload
need to add "wildcard" Ids for routing or for registration of memory managers or marshalers (wildcard as the default?)

PART III. Applications

10. Use cases:

single thread event dispatcher
multi-threaded message passing system
distributed message passing system
With payload=NULL, content based routing

11. Application 1: single-threaded event dispatcher

1> an event dispatching system for GUI:

GUI primitives: line/circle/rect/...
events_types: Button_Down_On/Button_Up_On/Selected/Moved
callbacks: to be called when events happen on chosen primitives

2> a channel for GUI events:

typedef struct {

event_type type; // Button_Down_On, Button_Up_On, Selected, Moved
enum primitive_type { line, circle, rect /* ... */ } prim_type; // field name is illustrative
GUI_primitive *prim; //or char *prim_name; for remote msgs?

} GUI_Channel_Id;

each GUI primitive publishes its own event types
event_handlers/event_listeners/callbacks register for specific events on specific primitives, for specific events on specific primitive types, or for specific events on all primitives.
these callbacks are invoked only when these events happen on these primitives.
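
The three registration levels amount to associative matching on the id fields, with wildcards for "any primitive type" and "any primitive". A sketch under those assumptions; the any_type enumerator, the NULL-pointer wildcard, the prim_type field name and the covers function are all hypothetical:

```cpp
#include <cassert>
#include <cstddef>

enum event_type { Button_Down_On, Button_Up_On, Selected, Moved };
enum primitive_type { any_type, line, circle, rect };

struct GUI_primitive { primitive_type kind; };

// Id used both for events and for registrations. In a registration,
// prim == NULL means "any primitive" and prim_type == any_type means
// "any primitive type".
struct GUI_Channel_Id {
    event_type type;
    primitive_type prim_type;
    const GUI_primitive* prim;
};

// Does a registration cover a concrete event?
inline bool covers(const GUI_Channel_Id& reg, const GUI_Channel_Id& ev) {
    if (reg.type != ev.type) return false;
    if (reg.prim != NULL) return reg.prim == ev.prim;                     // one primitive
    if (reg.prim_type != any_type) return reg.prim_type == ev.prim_type;  // one type
    return true;                                                          // all primitives
}

// Usage: a Selected event on one circle, checked against three registrations.
inline void gui_demo() {
    GUI_primitive c1 = {circle};
    GUI_primitive c2 = {circle};
    GUI_Channel_Id ev = {Selected, circle, &c1};
    GUI_Channel_Id on_c2 = {Selected, circle, &c2};
    GUI_Channel_Id on_circles = {Selected, circle, NULL};
    GUI_Channel_Id on_anything = {Selected, any_type, NULL};
    assert(!covers(on_c2, ev));
    assert(covers(on_circles, ev));
    assert(covers(on_anything, ev));
}
```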

3> completely independent of the existing class hierarchy and object containment hierarchy; can be added non-intrusively
4> similar to Java's event handling framework; supports dynamic event bindings

advantages:
event dispatching framework is independent of object hierarchies
supports asynchronous event handling and remote handling directly and transparently.

12. Application 2: CVM (Channel Virtual Machine)

a small dynamically reconfigurable framework based on ACE service_config, task frameworks and channels

13. Application 3: MsgBoard

Simple msg-persistence / tuple-space style application, using Berkeley DB as storage