Name-spaces for Message Passing and Name-spaces Connection Protocol Specification
Version 0.1
Yigong Liu
Updated 10/15/2009
Status of this Document
This is a draft.
Copyright Notice
Copyright (C) 2005-2009 Yigong Liu. All Rights Reserved. Licensed Under
Creative Commons Attribution 3.0.
Abstract
The Channel framework provides communication name-spaces shared by
loosely coupled peers (threads or callbacks) for asynchronous /
distributed message passing. Peers (message senders and receivers) bind
to each other through names in name-spaces and exchange messages.
Distributed channels can be connected or "mounted" to allow transparent
distributed message passing. Connected channels form a decentralized
peer-to-peer communication system.
Based on Channel's design, this document further explores the details
of the name-space connection / merge protocol for connecting
distributed channels. The discussion focuses on generic interactions
and is based on Gerard J. Holzmann's protocol model, so no details
related to any specific transport layer are presented. The author has
implemented the protocol on top of both TCP connections and shared
memory queues.
Table of Contents
- Requirements notation
- Introduction to Name-space Terminology
- Names (in name-space)
- Types of name-space
- Binding
- Interface
- Elements of Protocol Specification
- Protocol Service Overview
- Assumptions About Environment
- Protocol Vocabulary
- Message Format
- Protocol Operations (Procedure Rules)
- Channel connection
- Channel disconnection
- Publication of names
- Unpublication of names
- Subscription of names
- Unsubscription of names
- References
Requirements notation
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
"SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
document are to be interpreted as described in [RFC2119].
Introduction to Name-space Terminology
The following introduces the basic concepts of message passing
name-spaces. For more details please refer to section 4.1 of Channel's
design doc.
To use a channel for message passing, senders and receivers bind to
names in a name-space; binding and matching rules decide which senders
bind to which receivers; message passing and event dispatching then
happen among the bound senders and receivers.
- Names (in name-space)
A name has the following attributes:
- Id is the main content of a name. Integers, strings, POD structs
etc. can be used as ids for linear name spaces; string path names can
be used for hierarchical name spaces; regex patterns and Linda-style
tuples can be used for associative name spaces.
- Id_trait defines the id-matching algorithm, which partially
decides name-matching and therefore which senders bind to which
receivers.
- membership
A channel is a process-local name space which can be connected to
other local or remote channels. So there are 2 types of communicating
peers:
- MEMBER_LOCAL (local peers): communication peers inside the same
channel
- MEMBER_REMOTE (remote peers): communication peers from different
channels
- sending and receiving scope
When sending/receiving messages, we can
specify the scope of operations:
- SCOPE_LOCAL:
- publish/send specified messages to local peers;
- subscribe/receive specified messages from local peers
- SCOPE_REMOTE:
- publish/send specified messages to remote peers;
- subscribe/receive specified messages from remote peers
- SCOPE_GLOBAL:
- publish/send specified messages to both local and remote peers;
- subscribe/receive specified messages from both local and remote
peers
- Types of name-space
There are 3 types of name spaces, distinguished by their id-matching
algorithms and naming structures:
- Linear: there is an ordering relationship among ids, so they can be
arranged in a linear range. Exact matching is used for id-matching.
- Hierarchical: there is a containment relationship among ids, so they
can be arranged in tree/trie structures. Prefix matching is used for
id-matching.
- Associative: id-matching is based on associative lookup, similar to
Linda's tuple space or regular expression matching algorithms.
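The three id-matching styles can be illustrated with a minimal C++ sketch. The function names below are hypothetical and are not part of Channel's id_trait interface; the sketch also assumes int ids for the linear case, '/'-separated string paths for the hierarchical case (with a subscribed path matching every published path at or below it), and std::regex patterns standing in for associative lookup.

```cpp
#include <cassert>
#include <regex>
#include <string>

// Linear name space: ids bind only when they are exactly equal.
inline bool match_linear(int published, int subscribed) {
    return published == subscribed;
}

// Hierarchical name space (assumed rule): a subscribed path matches any
// published path at or below it, comparing whole path components.
inline bool match_hierarchical(const std::string& published,
                               const std::string& subscribed) {
    if (subscribed.empty()) return false;
    if (published.compare(0, subscribed.size(), subscribed) != 0) return false;
    return published.size() == subscribed.size() ||
           subscribed.back() == '/' ||
           published[subscribed.size()] == '/';
}

// Associative name space (assumed rule): the subscribed id is a regular
// expression that must match the whole published id.
inline bool match_associative(const std::string& published,
                              const std::string& pattern) {
    return std::regex_match(published, std::regex(pattern));
}

// Usage: one check per name-space type.
inline void demo_id_matching() {
    assert(match_linear(7, 7));
    assert(match_hierarchical("/sensors/temp/1", "/sensors"));
    assert(match_associative("temp42", "temp[0-9]+"));
}
```

Comparing whole path components keeps "/sensors" from matching "/sensorsX/temp", which a plain character prefix test would accept.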
- Binding
Names are created in the name space ONLY when they are bound for
sending/receiving messages:
- Named_Out: output/send interface bound with name; when a
named_out is created or added into channel, its name is published in name-space; when the
named_out is removed, its name is unpublished.
- Named_In: input/receive interface bound with name; when a
named_in is created or added into channel, its name is subscribed in name-space; when the
named_in is removed, its name is unsubscribed.
Name binding sets:
- for Named_Out, its binding_set is the set of bound Named_Ins to
which to send messages
- for Named_In, its binding_set is the set of bound Named_Outs from
which to receive messages
Binding_sets are decided by two algorithms:
- id matching
- scope and membership matching
There are four kinds of binding_sets: 1-1, 1-N, N-1, N-M.
- Interface
The interface of a function is:
- input: input arguments and
- output: output arguments and return values.
Similarly, a channel's interface to outside world (and other channels)
is:
- input: globally subscribed names/ids - the set of Named_In with
global/remote scope
- output: globally published names/ids - the set of Named_Out with
global/remote scope
Elements of Protocol Specification
Quote from Gerard J. Holzmann's DESIGN AND VALIDATION OF COMPUTER
PROTOCOLS:
"A protocol specification consists of
five distinct parts. To be complete, each specification should include
explicitly:
- The service to be provided by the protocol
- The assumptions about the environment in which the protocol is
executed
- The vocabulary of messages used to implement the protocol
- The encoding (format) of each message in the vocabulary
- The procedure rules guarding the consistency of message exchanges"
The following discussions will be divided into these five parts.
Protocol Service Overview
When 2 channels (A & B) are connected (or mounted), their
name-spaces will interact according to this protocol so that
transparent distributed message passing can be achieved. The
interactions specified in this protocol will have one result - the
name-spaces of connected channels are "merged":
- names flowing from B->A: the intersection of A's input
interface (its set of
Named_In with global/remote scope - global subscriptions) and B's
output interface (its set
of Named_Out with global/remote scope - global publications)
- names flowing from A->B: the intersection of B's input
interface (its set of
Named_In with global/remote scope - global subscriptions) and A's
output interface (its set
of Named_Out with global/remote scope - global publications)
- newly created names/ids are propagated to connected channels
according to their id / membership / scope.
- when channels are disconnected (either intentionally or caused by
connection failures), channels' namespaces will be updated
automatically so that all publications and subscriptions from
remote channels will be removed.
The purpose is that peers in channel A can communicate with remote
peers in channel B "transparently" in similar way as with the local
peers in channel A.
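The merge result described above can be sketched as a set intersection. This is a minimal illustration that assumes string ids with exact matching and ignores filters and translators; the Interface struct and names_flowing function are hypothetical names.

```cpp
#include <algorithm>
#include <cassert>
#include <iterator>
#include <set>
#include <string>

using IdSet = std::set<std::string>;

// A channel's interface to other channels.
struct Interface {
    IdSet input;    // ids of Named_In with global/remote scope
    IdSet output;   // ids of Named_Out with global/remote scope
};

// Names flowing from src to dst: src's output interface intersected with
// dst's input interface.
inline IdSet names_flowing(const Interface& src, const Interface& dst) {
    IdSet result;
    std::set_intersection(src.output.begin(), src.output.end(),
                          dst.input.begin(), dst.input.end(),
                          std::inserter(result, result.end()));
    return result;
}

// Usage: A subscribes to x and y and publishes p; B subscribes to p and q
// and publishes y and z.
inline void demo_merge() {
    Interface a{{"x", "y"}, {"p"}};
    Interface b{{"p", "q"}, {"y", "z"}};
    assert(names_flowing(b, a) == IdSet{"y"});   // B->A
    assert(names_flowing(a, b) == IdSet{"p"});   // A->B
}
```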
Filters and translators can be specified at connections among channels
to control the name space merge:
- filter: decide which ids are allowed to be exported/sent to
(visible at) remote
channels and which remote ids are allowed to be imported to local name
space
- translator: allow translation of ids imported to local name space
and ids exported to remote name space; so we can relocate the imported
remote name space ids to a specific subspace in the local name space,
similar to the way that in distributed file systems, a remote file
system can be mounted to a specific point in local file system.
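A minimal sketch of a filter and a translator for path-style ids follows. It assumes a translator that mounts imported ids under a fixed prefix and a filter that blocks one subtree; both structs and the import_remote_id helper are hypothetical and do not reflect Channel's actual filter or translator interfaces. Imported ids are translated first and then filtered, following the order used in the connection procedure.

```cpp
#include <cassert>
#include <optional>
#include <string>

inline bool has_prefix(const std::string& s, const std::string& prefix) {
    return s.compare(0, prefix.size(), prefix) == 0;
}

// Hypothetical filter: blocks every id under one subtree.
struct Filter {
    std::string blocked_prefix;   // empty means nothing is blocked
    bool allows(const std::string& id) const {
        return blocked_prefix.empty() || !has_prefix(id, blocked_prefix);
    }
};

// Hypothetical translator: relocates remote ids under a mount point.
struct Translator {
    std::string mount_point;
    std::string to_local(const std::string& remote_id) const {
        return mount_point + remote_id;
    }
    std::string to_remote(const std::string& local_id) const {
        return has_prefix(local_id, mount_point)
                   ? local_id.substr(mount_point.size())
                   : local_id;
    }
};

// Import path: translate first, then filter. Returns the local id, or
// nothing when the filter blocks it.
inline std::optional<std::string> import_remote_id(const std::string& remote_id,
                                                   const Translator& t,
                                                   const Filter& f) {
    std::string local_id = t.to_local(remote_id);
    if (!f.allows(local_id)) return std::nullopt;
    return local_id;
}

// Usage: mount channel B's ids under "/remote/b".
inline void demo_mount() {
    Translator t{"/remote/b"};
    Filter f{"/remote/b/private"};
    assert(import_remote_id("/sensors/temp", t, f).value() ==
           "/remote/b/sensors/temp");
    assert(!import_remote_id("/private/key", t, f).has_value());
}
```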
Please note that channel A will not automatically propagate the
names/ids it receives from channel B to channel C (suppose that channel
A connects to channel B and to channel C, and there is no connection
between channels B and C). If peers in channel B need to exchange
messages with peers in channel C, either channel B should connect
directly to channel C, or some peer at channel A should do the
forwarding between B and C.
Assumptions About Environment
The basic assumption is that the transport connection between channels
delivers messages reliably. However, when distributed applications are
built with mechanisms for redundancy and for failure detection with
retry, higher-level logic can guard against transport failures.
Protocol Vocabulary
There are two categories of message types or ids:
- Application messages: for implementing application functionality;
different applications will have different sets of message types.
- System messages: for name-space management and propagation of new
message ids, which is the subject of this specification.
There are eight system messages, which can be further divided into two
groups:
- messages for connection set up, tear down and initial name-space
merge:
- channel_conn_msg: the first message sent when one channel connects
to another channel.
- channel_disconn_msg: the message sent when one channel actively
disconnects from another channel.
- init_subscription_info_msg: the message carrying one channel's full
set of global subscriptions.
- connection_ready_msg: the message notifying that the initial
name-space merge is complete and the connection is ready for normal
operations.
- messages for ids/names propagation during normal operations:
- subscription_info_msg: the message announcing a new global
subscription to connected channels.
- unsubscription_info_msg: the message removing a global subscription
from connected channels.
- publication_info_msg: the message announcing a new global
publication to connected channels.
- unpublication_info_msg: the message removing a global publication
from connected channels.
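For illustration only, the vocabulary above could be written down as an enumeration. The SysMsg type and the is_connection_msg helper are hypothetical; an actual implementation identifies system messages by ids of the channel's own id type.

```cpp
#include <cassert>

// The eight system messages, in the order listed above.
enum class SysMsg {
    channel_conn_msg,
    channel_disconn_msg,
    init_subscription_info_msg,
    connection_ready_msg,
    subscription_info_msg,
    unsubscription_info_msg,
    publication_info_msg,
    unpublication_info_msg
};

// True for the group used during connection set up, tear down and the
// initial name-space merge; false for the normal-operation group.
inline bool is_connection_msg(SysMsg m) {
    switch (m) {
        case SysMsg::channel_conn_msg:
        case SysMsg::channel_disconn_msg:
        case SysMsg::init_subscription_info_msg:
        case SysMsg::connection_ready_msg:
            return true;
        default:
            return false;
    }
}

// Usage: classify one message from each group.
inline void demo_vocabulary() {
    assert(is_connection_msg(SysMsg::connection_ready_msg));
    assert(!is_connection_msg(SysMsg::publication_info_msg));
}
```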
Message Format
Here we use generic terms to describe the formats. Different transports
may use different formats and encodings, such as plain text, xml based,
or binary formats.
A general format is used to marshal/demarshal all messages to/from
transport layer frames:
[id_len, id_data, message_len, message_data]
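One possible binary encoding of this frame layout is sketched below. The 32-bit big-endian length fields, the Frame struct and the marshal/demarshal function names are assumptions made for the example; the specification leaves the concrete encoding to each transport.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <optional>
#include <string>

// In-memory view of one frame: [id_len, id_data, message_len, message_data].
struct Frame {
    std::string id;     // id_data
    std::string data;   // message_data
};

// Append a 32-bit length in big-endian byte order (assumed encoding).
inline void put_u32(std::string& out, std::uint32_t v) {
    for (int shift = 24; shift >= 0; shift -= 8)
        out.push_back(static_cast<char>((v >> shift) & 0xFF));
}

// Read a 32-bit big-endian length at pos; fails if fewer than 4 bytes remain.
inline bool get_u32(const std::string& in, std::size_t& pos, std::uint32_t& v) {
    if (in.size() - pos < 4) return false;
    v = 0;
    for (int i = 0; i < 4; ++i)
        v = (v << 8) | static_cast<unsigned char>(in[pos++]);
    return true;
}

inline std::string marshal(const Frame& f) {
    std::string out;
    put_u32(out, static_cast<std::uint32_t>(f.id.size()));
    out += f.id;
    put_u32(out, static_cast<std::uint32_t>(f.data.size()));
    out += f.data;
    return out;
}

// Returns nothing when the buffer is truncated.
inline std::optional<Frame> demarshal(const std::string& in) {
    std::size_t pos = 0;
    std::uint32_t len = 0;
    Frame f;
    if (!get_u32(in, pos, len) || in.size() - pos < len) return std::nullopt;
    f.id = in.substr(pos, len);
    pos += len;
    if (!get_u32(in, pos, len) || in.size() - pos < len) return std::nullopt;
    f.data = in.substr(pos, len);
    return f;
}

// Usage: a frame survives a marshal/demarshal round trip.
inline void demo_frame_roundtrip() {
    Frame f{"publication_info_msg", "abc"};
    std::optional<Frame> g = demarshal(marshal(f));
    assert(g.has_value() && g->id == f.id && g->data == f.data);
}
```

A text or XML transport would replace put_u32/get_u32 with its own delimiters while keeping the same four fields.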
Applications will define specific message data formats for application
message types/ids. The eight system messages use the following 2
message data formats:
The connection related messages (channel_conn_msg / channel_disconn_msg
/ connection_ready_msg) use the following data formats:
#include <string>
struct channel_info_msg_t {
    std::string host_addr;
    // other connection info in future:
    // 1. protocol version
    // 2. authentication
    // ...
};
The pub/sub related messages (init_subscription_info_msg /
subscription_info_msg / unsubscription_info_msg / publication_info_msg
/ unpublication_info_msg) use:
#include <vector>
// id_type is the channel's id type (integer, string, POD struct, etc.)
struct pubsub_info_msg_t {
    std::vector<id_type> msg_types;
};
Protocol Operations (Procedure Rules)
- Channel connection
When channel A starts connecting to channel B, the following handshake
takes place:
- channel A sends a channel_conn_msg containing its host address
(and more identification and authentication info in future) to
channel B.
- when channel B receives A's channel_conn_msg, it returns a
channel_conn_msg with its own host address to channel A. In future we
can add code for connection authentication.
- when A receives B's channel_conn_msg, A will send to B an
init_subscription_info_msg. This message contains all of A's
current global subscription ids which are not blocked by A's filter,
translated by A's translator.
- when B receives A's init_subscription_info_msg, B will carry out
two jobs:
- for each message id "N" contained inside this message, B will
translate it using B's translator and check if it is blocked by B's
filter, then check if it matches one of B's global publication
ids; if so, two actions will be performed:
- a remote (MEMBER_REMOTE) Named_In with id "N" and
SCOPE_LOCAL will be added at channel B which will forward messages from
local peers to channel A
- a publication_info_msg with id "N" will be sent to channel
A.
- B will send its own init_subscription_info_msg to A.
- when A receives B's init_subscription_info_msg, A will carry out
two jobs:
- for each message id "N" contained inside this message, A will
translate it using A's translator and check if it is blocked by A's
filter, then check if it matches one of A's global publication
ids; if so, two actions will be performed:
- a remote (MEMBER_REMOTE) Named_In with id "N" and
SCOPE_LOCAL will be added at channel A which will forward messages from
local peers to channel B
- a publication_info_msg with id "N" will be sent to channel B.
- A will send its own connection_ready_msg to B.
- when B receives A's connection_ready_msg, B's interface will
transition into "active" state and return a connection_ready_msg to A.
- when A receives B's connection_ready_msg, A's interface will
transition into "active" state.
- during the connection setup handshake, all outgoing application
messages are buffered; they are sent once the interfaces transition
into the "active" state.
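The handshake above can be simulated in memory with a short sketch. It assumes string ids with exact matching, identity filters and translators, and a reliable ordered queue standing in for the transport; buffering of application messages is left out. The Channel, Msg and run_handshake names are hypothetical. A publication_info_msg received during the handshake is handled as in the subscription procedure, by adding a remote Named_Out.

```cpp
#include <cassert>
#include <deque>
#include <set>
#include <string>
#include <utility>
#include <vector>

enum class Type { conn, init_sub, pub, ready };

struct Msg {
    Type type;
    std::vector<std::string> ids;
};

struct Channel {
    bool initiator = false;
    bool active = false;
    std::set<std::string> global_subs, global_pubs;  // local global Named_In / Named_Out ids
    std::set<std::string> remote_ins, remote_outs;   // MEMBER_REMOTE, SCOPE_LOCAL proxies

    Msg init_sub_msg() const {
        return Msg{Type::init_sub,
                   std::vector<std::string>(global_subs.begin(), global_subs.end())};
    }

    // Handle one incoming system message; return the replies to send.
    std::vector<Msg> on_message(const Msg& m) {
        std::vector<Msg> out;
        switch (m.type) {
        case Type::conn:       // B answers with conn; A starts the merge
            out.push_back(initiator ? init_sub_msg() : Msg{Type::conn, {}});
            break;
        case Type::init_sub:   // match the peer's subscriptions against local publications
            for (const auto& id : m.ids)
                if (global_pubs.count(id)) {
                    remote_ins.insert(id);
                    out.push_back(Msg{Type::pub, {id}});
                }
            out.push_back(initiator ? Msg{Type::ready, {}} : init_sub_msg());
            break;
        case Type::pub:        // peer publishes an id we subscribed to
            remote_outs.insert(m.ids.begin(), m.ids.end());
            break;
        case Type::ready:
            active = true;
            if (!initiator) out.push_back(Msg{Type::ready, {}});
            break;
        }
        return out;
    }
};

// Runs the whole handshake; a initiates the connection to b.
inline void run_handshake(Channel& a, Channel& b) {
    a.initiator = true;
    b.initiator = false;
    std::deque<std::pair<Channel*, Msg>> wire;   // (receiver, message)
    wire.push_back({&b, Msg{Type::conn, {}}});
    while (!wire.empty()) {
        auto [to, msg] = wire.front();
        wire.pop_front();
        Channel* from = (to == &a) ? &b : &a;
        for (const auto& reply : to->on_message(msg)) wire.push_back({from, reply});
    }
}

// Usage: both sides end up active with mirrored proxies.
inline void demo_handshake() {
    Channel a, b;
    a.global_subs = {"x", "y"};
    a.global_pubs = {"p"};
    b.global_subs = {"p"};
    b.global_pubs = {"x"};
    run_handshake(a, b);
    assert(a.active && b.active);
    assert(a.remote_outs.count("x") == 1 && b.remote_ins.count("x") == 1);
}
```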
- Channel disconnection
When channel A disconnects from channel B, because of either an
intentional disconnect or a transport connection failure, the following
will happen:
- at channel A, the connection object and the interface object will
be destroyed
- a channel_disconn_msg will be sent to local name-space at
channel A
- when the interface object is destroyed, all publications and
subscriptions from remote channel B are removed from A's namespace.
- a channel_disconn_msg will be sent to channel B
- at channel B, when receiving channel_disconn_msg from channel A,
or detecting transport connection failure
- the connection and interface object at B will be destroyed
- a channel_disconn_msg will be sent to local name-space at
channel B
- when the interface object is destroyed, all publications and
subscriptions from remote channel A are removed from B's namespace.
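The local cleanup on either side can be sketched as follows. The sketch assumes each binding records whether it is a remote proxy and which connection created it; the conn_id field, the NameSpace struct and the on_disconnect function are hypothetical names, and local delivery of channel_disconn_msg is modelled as an entry in an event log.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

struct Binding {
    std::string id;
    bool remote;    // true for MEMBER_REMOTE proxies
    int conn_id;    // connection that created the proxy (hypothetical field)
};

struct NameSpace {
    std::vector<Binding> named_ins, named_outs;
    std::vector<std::string> local_events;   // system messages delivered locally
};

// Remove every publication and subscription that came from the peer on
// the given connection, then notify the local name-space.
inline void on_disconnect(NameSpace& ns, int conn_id) {
    auto from_peer = [conn_id](const Binding& b) {
        return b.remote && b.conn_id == conn_id;
    };
    ns.named_ins.erase(
        std::remove_if(ns.named_ins.begin(), ns.named_ins.end(), from_peer),
        ns.named_ins.end());
    ns.named_outs.erase(
        std::remove_if(ns.named_outs.begin(), ns.named_outs.end(), from_peer),
        ns.named_outs.end());
    ns.local_events.push_back("channel_disconn_msg");
}

// Usage: only proxies from connection 1 disappear.
inline void demo_disconnect() {
    NameSpace ns;
    ns.named_ins.push_back({"x", true, 1});
    ns.named_ins.push_back({"y", false, 0});
    ns.named_outs.push_back({"p", true, 1});
    on_disconnect(ns, 1);
    assert(ns.named_ins.size() == 1 && ns.named_ins[0].id == "y");
    assert(ns.named_outs.empty());
}
```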
- Publication of names
If a new local (MEMBER_LOCAL) Named_Out with id "N" is added (name "N"
is published) with global/remote scope in channel A, channel A will
send a publication_info_msg containing "N" to all connected channels.
When channel B receives this message, it will check its name space. If
there is a local (MEMBER_LOCAL) Named_In with global/remote scope whose
id matches "N" (using the id-matching algorithm defined by id_trait),
the following will happen at channels A and B:
- at channel B's interface:
- a remote (MEMBER_REMOTE) Named_Out with id "N" and
SCOPE_LOCAL will be added at channel B which will forward messages from
channel A to local peers
- a subscription_info_msg with id "N" will be sent to channel
A.
- at channel A's interface:
- after receiving subscription_info_msg with "N" from channel
B, a remote (MEMBER_REMOTE) Named_In with id "N" and SCOPE_LOCAL will
be added at channel A, which will forward messages from local
Named_Outs with id "N" to channel B
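The two halves of this exchange can be sketched as a pair of handlers. The sketch assumes string ids with exact matching and no filter or translator; the Channel struct and both function names are hypothetical.

```cpp
#include <cassert>
#include <optional>
#include <set>
#include <string>

struct Channel {
    std::set<std::string> local_global_ins;   // MEMBER_LOCAL Named_In, global/remote scope
    std::set<std::string> remote_outs;        // MEMBER_REMOTE Named_Out, SCOPE_LOCAL
    std::set<std::string> remote_ins;         // MEMBER_REMOTE Named_In, SCOPE_LOCAL
};

// Channel B receives publication_info_msg("N") from channel A.
// Returns the id to send back in a subscription_info_msg, or nothing
// when no local subscriber matches.
inline std::optional<std::string> on_publication_info(Channel& b,
                                                      const std::string& id) {
    if (b.local_global_ins.count(id) == 0) return std::nullopt;
    b.remote_outs.insert(id);   // will forward messages from A to local peers
    return id;
}

// Channel A receives the subscription_info_msg("N") reply from channel B.
inline void on_subscription_reply(Channel& a, const std::string& id) {
    a.remote_ins.insert(id);    // will forward local messages to B
}

// Usage: A publishes "N" and B has a matching global subscriber.
inline void demo_publication() {
    Channel a, b;
    b.local_global_ins = {"N"};
    std::optional<std::string> reply = on_publication_info(b, "N");
    assert(reply.has_value());
    on_subscription_reply(a, *reply);
    assert(b.remote_outs.count("N") == 1 && a.remote_ins.count("N") == 1);
}
```

The subscription procedure is the mirror image: swap the roles of Named_In and Named_Out, and of subscription_info_msg and publication_info_msg.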
- Unpublication of names
If a local (MEMBER_LOCAL) Named_Out with id "N" and global/remote scope
is removed (name "N" is unpublished) in channel A, channel A will send
an unpublication_info_msg containing "N" to all connected channels.
When channel B receives this message, the following will happen at
channels A and B:
- at channel B's interface:
- remote (MEMBER_REMOTE) Named_Out with id "N" and SCOPE_LOCAL
will be removed from channel B's name-space
- an unsubscription_info_msg with id "N" will be sent to channel
A.
- at channel A's interface:
- after receiving unsubscription_info_msg with "N" from channel
B, remote (MEMBER_REMOTE) Named_In with id "N" and SCOPE_LOCAL will
be removed from channel A's name-space
- Subscription of names
If a new local (MEMBER_LOCAL) Named_In with id "N" is added (name "N"
is subscribed) with global/remote scope in channel A, channel A will
send a subscription_info_msg containing "N" to all connected channels.
When channel B receives this message, it will check its name space. If
there is a local (MEMBER_LOCAL) Named_Out with global/remote scope
whose id matches "N" (using the id-matching algorithm defined by
id_trait), the following will happen at channels A and B:
- at channel B's interface:
- a remote (MEMBER_REMOTE) Named_In with id "N" and SCOPE_LOCAL
will be added at channel B which will forward messages from
local peers to channel A
- a publication_info_msg with id "N" will be sent to channel
A.
- at channel A's interface:
- after receiving publication_info_msg with "N" from channel
B, a remote (MEMBER_REMOTE) Named_Out with id "N" and SCOPE_LOCAL will
be added at channel A, which will forward messages from channel B to
local Named_Ins
with id "N"
- Unsubscription of names
If a local (MEMBER_LOCAL) Named_In with id "N" and global/remote scope
is removed (name "N" is unsubscribed) in channel A, channel A will send
an unsubscription_info_msg containing "N" to all connected channels.
When channel B receives this message, the following will happen at
channels A and B:
- at channel B's interface:
- remote (MEMBER_REMOTE) Named_In with id "N" and SCOPE_LOCAL
will be removed from channel B's name-space
- an unpublication_info_msg with id "N" will be sent to channel
A.
- at channel A's interface:
- after receiving unpublication_info_msg with "N" from channel
B, remote (MEMBER_REMOTE) Named_Out with id "N" and SCOPE_LOCAL will
be removed from channel A's name-space
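The two removal procedures (unpublication and unsubscription) can be sketched together, since they mirror each other. The sketch runs both ends of the exchange in one process and records the system messages in a log standing in for the transport; the Channel struct, the Wire alias and both function names are hypothetical.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <vector>

struct Channel {
    std::set<std::string> remote_ins;    // MEMBER_REMOTE Named_In, SCOPE_LOCAL
    std::set<std::string> remote_outs;   // MEMBER_REMOTE Named_Out, SCOPE_LOCAL
};

using Wire = std::vector<std::string>;   // log of system messages, in order

// Channel A removes a global Named_Out with the given id.
inline void unpublish(Channel& a, Channel& b, const std::string& id, Wire& wire) {
    wire.push_back("unpublication_info_msg:" + id);    // A -> B
    b.remote_outs.erase(id);                           // B drops its proxy sender
    wire.push_back("unsubscription_info_msg:" + id);   // B -> A
    a.remote_ins.erase(id);                            // A drops its proxy receiver
}

// Channel A removes a global Named_In with the given id.
inline void unsubscribe(Channel& a, Channel& b, const std::string& id, Wire& wire) {
    wire.push_back("unsubscription_info_msg:" + id);   // A -> B
    b.remote_ins.erase(id);                            // B drops its proxy receiver
    wire.push_back("unpublication_info_msg:" + id);    // B -> A
    a.remote_outs.erase(id);                           // A drops its proxy sender
}

// Usage: undo a binding in which A published "N" and B subscribed to it.
inline void demo_unpublish() {
    Channel a, b;
    a.remote_ins = {"N"};
    b.remote_outs = {"N"};
    Wire wire;
    unpublish(a, b, "N", wire);
    assert(a.remote_ins.empty() && b.remote_outs.empty());
    assert(wire.size() == 2);
}
```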
References
- Gerard J. Holzmann, DESIGN AND VALIDATION OF COMPUTER PROTOCOLS
- Plan 9 Remote Resource Protocol
- RFC 2119, "Key words for use in RFCs to Indicate Requirement Levels"
- Channel design document: "Channel - A Name Space Based C++ Framework
For Asynchronous Distributed Message Passing and Event Dispatching"
- Channel project website