<?xml version="1.0"?>
<!DOCTYPE book PUBLIC "-//Norman Walsh//DTD DocBk XML V4.2//EN" 
               "/usr/share/sgml/docbook/dtd/xml/4.2/docbookx.dtd"
	[	<!-- Add other entries here	-->

<!ENTITY % ISOnum PUBLIC "ISO 8879:1986//ENTITIES Numeric and Special Graphic//EN" "ISOnum.pen">
%ISOnum;

<!ENTITY cover SYSTEM "cover.xml">
<!ENTITY pmqueue SYSTEM "pmqueue.xml">

]>

<book>
  <bookinfo>
    <title>Component Description</title>
    <subtitle>POSIX Message Queues</subtitle>
    <copyright>
      <year>2003</year>
      <holder>OCERA</holder>
    </copyright>

    <authorgroup>
      <author>
	<firstname>Sergio</firstname>
	<surname>Saez</surname>
	<affiliation>
           <orgname>DISCA, Universidad Politecnica de Valencia</orgname>
           <address>e-mail: <email>ssaez@disca.upv.es</email></address>
        </affiliation>
      </author>
    </authorgroup>
  </bookinfo>
  
  &cover;

  <chapter>
    <title>POSIX Message Queues</title>
    
    <section>
      <title>Description</title>
      
      <para>
        UNIX systems offer several possibilities for interprocess communication: signals,
        pipes and FIFO queues, shared memory, sockets, etc.  In RTLinux, the most flexible one
        is shared memory, but the programmer has to use an additional synchronisation
        mechanism to build safe communication between processes or threads. On
        the other hand, signals and pipes lack the flexibility needed to establish
        communication channels between processes.
      </para>
      <para>
        To cover some of these weaknesses, the POSIX standard proposes a message
        passing facility that offers:

        <itemizedlist>
          <listitem>
            <simpara>
              <emphasis role="bold">Protected and synchronised access to the message
                queue</emphasis>. Access to data stored in the message queue is properly
              protected against concurrent operations.
            </simpara>
          </listitem>
          <listitem>
            <simpara>
              <emphasis role="bold">Prioritised messages</emphasis>. Processes can build
              several flows over the same queue, and the receiver is guaranteed to
              pick up the oldest message from the most urgent flow.
            </simpara>
          </listitem>
          <listitem>
            <simpara>
              <emphasis role="bold">Asynchronous and timed
                operation</emphasis>. Processes do not have to wait for an operation to finish,
              i.e., they can send a message without having to wait for someone to read
              that message. They can also wait for a specified amount of time, or not wait
              at all, if the message queue is full or empty.
            </simpara>
          </listitem>
          <listitem>
            <simpara>
              <emphasis role="bold">Asynchronous notification of message
                arrivals</emphasis>. A receiver process can configure the message queue to
              notify it of message arrivals. Such a process can therefore work on
              something else until the expected message arrives.
            </simpara>
          </listitem>
        </itemizedlist>
      </para>
    </section>

    <section>
      <title>Layer</title>
      
      <para>
        POSIX Message Queues is a message passing facility that relies only on services
        that are already available in the RTLinux core or that other components are
        going to add to it. As it does not require any modification of
        RTLinux itself, it can be located at the High-Level RTLinux layer.
      </para>
    </section>
    
    <section>
      <title>API / Compatibility</title>
      
      <para>
        This component follows the POSIX API specification for the message passing facility
        defined in IEEE Std 1003.1-2001. This API also belongs to the Open Group Base
        Specifications Issue 6.  The following synopsis lists the supported
        message queue functions:

        <programlisting>
int     mq_close (mqd_t);
int     mq_getattr (mqd_t, struct mq_attr *);
int     mq_notify (mqd_t, const struct sigevent *);
mqd_t   mq_open (const char *, int, ...);
ssize_t mq_receive (mqd_t, char *, size_t, unsigned *);
int     mq_send (mqd_t, const char *, size_t, unsigned );
int     mq_setattr (mqd_t, const struct mq_attr *, struct mq_attr *);
ssize_t mq_timedreceive (mqd_t, char *, size_t, unsigned *, const struct timespec *);
int     mq_timedsend (mqd_t, const char *, size_t, unsigned, const struct timespec *);
int     mq_unlink (const char *);
        </programlisting>
      </para>
    </section>

    <section>
      <title>Dependencies</title>
      
      <para>
        This component has been developed for the RTLinux executive version 3.2
        pre-release 1. It depends on several RTLinux services that are already available,
        such as POSIX Timeouts and POSIX Semaphores (including the function
        <function>sem_timedwait</function>), and on new POSIX services developed in this
        project, such as POSIX Signals, which also improve the behaviour of some existing
        services.
      </para>

      <para>
        Although POSIX Timers are not required, if they are available, the timeout parameters
        of the <function>mq_timedsend</function> and
        <function>mq_timedreceive</function> functions have to be based on the CLOCK_REALTIME
        clock, as specified in IEEE Std 1003.1-2001.
      </para>
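      <para>
        As an illustration of a timeout based on CLOCK_REALTIME, the following minimal
        sketch builds an absolute deadline from a relative delay. The helper names
        <function>timespec_add_ms</function> and <function>deadline_from_now_ms</function>
        are hypothetical and are not part of this component or of the POSIX API; only
        <function>clock_gettime</function> is a standard call.
      </para>
      <programlisting><![CDATA[
```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L  /* expose clock_gettime() under strict ISO C modes */
#endif

#include <assert.h>
#include <time.h>

#define NSEC_PER_SEC 1000000000L

/* Hypothetical helper: add a relative delay in milliseconds to a base time,
 * keeping tv_nsec normalised to the range [0, 1e9). */
struct timespec timespec_add_ms(struct timespec base, long delay_ms)
{
    assert(delay_ms >= 0);
    base.tv_sec  += delay_ms / 1000;
    base.tv_nsec += (delay_ms % 1000) * 1000000L;
    if (base.tv_nsec >= NSEC_PER_SEC) {
        base.tv_sec  += 1;
        base.tv_nsec -= NSEC_PER_SEC;
    }
    return base;
}

/* Hypothetical helper: absolute CLOCK_REALTIME deadline, delay_ms from now,
 * in the form taken by the timed send and receive functions. */
struct timespec deadline_from_now_ms(long delay_ms)
{
    struct timespec now;
    int rc = clock_gettime(CLOCK_REALTIME, &now);
    assert(rc == 0);
    (void)rc;  /* silence the unused-variable warning when NDEBUG is set */
    return timespec_add_ms(now, delay_ms);
}
```
]]></programlisting>
      <para>
        A caller would pass the address of the returned structure as the last argument of
        <function>mq_timedsend</function> or <function>mq_timedreceive</function>.
      </para>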
      
    </section>

    <section>
      <title>Status</title>
      
      <para>This component is in the testing stage. The tests already passed are described
        below, in the <emphasis>Tests</emphasis> section.
      </para>
      
    </section>

    <section>
      <title>Implementation issues</title>
      
      <para>
        The POSIX Message Queues implementation does not require modifying the core RTLinux
        executive, so the implementation issues concern only internal structures and
        algorithms.
      </para>

      <para>
        Several issues have been considered when implementing POSIX Message Queues. They
        are related to memory allocation at queue creation time, synchronisation,
        and the management of message priorities. The following sections present these issues.
      </para>

      <section>
        <title>Queue creation</title>
        
        <para>
          When a message queue is created with the <function>mq_open</function>
          function, the maximum number of messages and the maximum message size become
          available.  The maximum memory requirements of the message
          queue are then known and the resource reservation can be performed. This was the
          original intention described in the rationale of the standard: resource
          reservation can be performed at a single point, i.e., at queue creation.
        </para>
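        <para>
          To make the reservation argument concrete, the following sketch computes a
          worst-case pool size from the two limits supplied at creation time
          (<function>mq_maxmsg</function> and <function>mq_msgsize</function> in
          <function>struct mq_attr</function>). The <function>struct msg_header</function>
          layout and the function <function>queue_pool_bytes</function> are assumptions
          made for illustration only and do not describe the actual internal structures
          of this component.
        </para>
        <programlisting><![CDATA[
```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-message bookkeeping; not the component's real layout. */
struct msg_header {
    struct msg_header *next;  /* link in a FIFO or free list */
    size_t             len;   /* actual payload length */
    unsigned           prio;  /* message priority */
};

/* Worst-case number of bytes needed by one queue, computed once at
 * creation time from the mq_maxmsg and mq_msgsize limits.
 * This sketch does not check the multiplication for overflow. */
size_t queue_pool_bytes(long maxmsg, long msgsize)
{
    assert(maxmsg > 0 && msgsize > 0);

    /* Round each payload up so that the next header stays aligned. */
    size_t align   = _Alignof(struct msg_header);
    size_t payload = ((size_t)msgsize + align - 1) / align * align;

    return (size_t)maxmsg * (sizeof(struct msg_header) + payload);
}
```
]]></programlisting>
        <para>
          With a single figure such as this one, the whole pool can be reserved with one
          allocation when the queue is created, and no further allocation is needed while
          messages are sent and received.
        </para>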

        <para>
          As the original RTLinux executive has no dynamic memory support, message queues
          can be created only when the module is loaded into the kernel. At
          that moment, the Linux kernel <function>kmalloc</function> is available for dynamic
          memory reservation, and the <function>mq_open</function> function can be
          implemented without problems. This introduces an additional
          restriction with respect to the original POSIX API, one that is already present in
          the implementation of other RTLinux POSIX functions (e.g. <function>pthread_create</function>).
        </para>

        <para>
          This restriction greatly simplifies the message queue creation process, since it
          eliminates synchronisation requirements. The Linux module loading process
          guarantees atomicity when calling the <function>mq_open</function> function, and
          therefore no synchronisation mechanism is needed to achieve mutual exclusion
          when the internal structures of the message queue system are modified.
        </para>

        <para>
          A second version of this component could rely on the dynamic memory component,
          providing a less restrictive implementation of the <function>mq_open</function>
          function. Although this option would provide a more flexible API, the new version
          would have to follow the POSIX standard requirements on the atomicity of the
          message queue opening process. These requirements would introduce additional
          overhead when opening and creating a new message queue.
        </para>
        
      </section>

      <section>
        <title>Synchronisation issues</title>

        <para>
          Sending and receiving messages requires a synchronisation mechanism to
          achieve mutual exclusion when the internal structures of the message queue system
          are modified. The RTLinux executive provides several synchronisation services:
          low-level synchronisation, POSIX Mutexes and Condition Variables, and
          POSIX Semaphores. Each of these options is analysed next.
        </para>
        
        <para>
          <glosslist>
            <glossentry>
              <glossterm>Low-level synchronisation mechanism</glossterm> 
              
              <glossdef> <para> This mechanism is based on using spin-locks and
              disabling external interrupts to create system-wide mutual exclusion
              regions.  Using this mechanism requires re-implementing a semaphore-like
              service and a timeout service for the timed send/receive
              functions. Additionally, blocking times when accessing the shared data of
              message queues can influence the scheduling of every thread in the
              system, and not only that of the threads that can access a given message
              queue. However, this kind of custom implementation could achieve a lower
              overhead than a generic solution.  </para>
              </glossdef>
            </glossentry>

            <glossentry>
              <glossterm>POSIX Mutexes and Condition Variables</glossterm> 

              <glossdef> <para> This mechanism is the most elaborate one, and it is
              specially designed for the kind of shared access that is performed in POSIX
              Message Queues. However, one very important restriction rules out its
              use: threads waiting on a message queue must leave the wait if they receive a
              signal, whereas threads waiting on a POSIX Mutex remain blocked after the
              execution of the corresponding signal handler. Additionally, there is no
              timed wait on a POSIX Mutex, and therefore an extra mechanism would have to be
              designed to implement the timed send and receive functions.</para>
              </glossdef>
            </glossentry>

            <glossentry>
              <glossterm>POSIX Semaphores</glossterm>

              <glossdef>
                <para> POSIX Semaphores are an intermediate solution between the low-level
                mechanism and POSIX Mutexes. However, they have two additional
                advantages over POSIX Mutexes: first, a thread blocked on a POSIX
                Semaphore can be removed from the waiting queue when a signal arrives;
                and second, a POSIX Semaphore allows a thread to block for a
                specified amount of time. Both features match the
                requirements that the POSIX standard imposes on the behaviour of POSIX
                Message Queues implementations.
                </para>
              </glossdef>
            </glossentry>
          </glosslist>
        </para>

        <para>
          As can be derived from the analysis above, the mechanism selected to
          implement mutually exclusive access to shared data in a message queue was POSIX
          Semaphores. This service was already available in the RTLinux executive, but
          it has been improved while implementing the POSIX Signals and POSIX Timers
          components. The improved RTLinux Semaphores now conform to the POSIX standard,
          providing the services and behaviour required to implement POSIX
          Message Queues.
        </para>
      </section>

      <section>
        <title>Sorting of the prioritised messages</title>

        <para>
          The POSIX Message Queues standard requires that a receive operation always returns
          the oldest message with the highest priority. This requires the
          message queue to perform some kind of sorting that allows messages to be extracted
          in priority order and, within each priority, in FIFO order.
        </para>
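        <para>
          The ordering rule can be stated compactly as a comparison between two pending
          messages. The sketch below assumes a hypothetical descriptor,
          <function>struct msg_key</function>, holding the message priority and an arrival
          counter; neither the type nor the function <function>msg_before</function>
          belongs to this component.
        </para>
        <programlisting><![CDATA[
```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical message descriptor used only for this illustration. */
struct msg_key {
    unsigned      prio;  /* POSIX message priority: larger is more urgent */
    unsigned long seq;   /* arrival counter: smaller is older */
};

/* Return true when message a must be received before message b. */
bool msg_before(const struct msg_key *a, const struct msg_key *b)
{
    assert(a != NULL && b != NULL);
    if (a->prio != b->prio)
        return a->prio > b->prio;  /* higher priority first */
    return a->seq < b->seq;        /* FIFO order within one priority */
}
```
]]></programlisting>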

        <para>
          Several possibilities arise when this sorting mechanism is analysed.  The
          options that have been considered for the implementation of this component
          are:
          
          <itemizedlist>
            <listitem>
              <para>
                Use a sorted queue that holds all the pending messages sorted by
                priority and, within each priority, in FIFO order. This structure can be,
                e.g., a heap of pointers to messages. 
              </para>
              <para>
                This solution has low memory requirements, but each insertion into and
                extraction from the message queue has a logarithmic computational
                cost with respect to the number of messages in the queue.
              </para>
            </listitem>

            <listitem>
              <para>
                Use a sorted priority queue that holds only one token per priority,
                sorted only by priority value. Each priority has its own
                FIFO queue that stores the pending messages of that priority. When a
                receive operation is performed, the highest priority token is used to
                select the FIFO queue from which to extract the message. When a FIFO queue
                becomes empty, the corresponding priority token is removed from the top of the
                priority queue. When a new message arrives and its FIFO queue is empty, the
                priority token has to be inserted into the priority queue. In this way, the
                priority queue only holds tokens that correspond to non-empty FIFO queues.
              </para>
              <para>
                This implementation provides a constant computational cost for insertions
                and extractions when the FIFO queue of a given priority is not empty,
                i.e., this solution optimises the FIFO access to the queue. In the worst
                case, when a new message arrives and its FIFO queue is empty, the
                computational cost of inserting the priority token is logarithmic with
                respect to the number of active priorities in the queue. This value is always
                equal to or lower than the number of messages in the queue, and therefore
                the computational cost of this approach is never worse than that of the
                previous solution.
              </para>
              <para>
                On the other hand, the memory requirements of this solution are
                proportional to the number of available priorities.
              </para>
            </listitem>

            <listitem>
              <para>
                Use a bitmap to store the priorities used by the pending messages and
                low-level processor-specific instructions to find the highest priority
                stored in the bitmap.  The rest of the implementation could remain as in the
                previous solution (FIFO queues within each priority).
              </para>
              <para>
                This approach should obtain the best trade-off between computational costs
                and memory requirements, but it is also the least portable solution.
              </para>
            </listitem>
          </itemizedlist>
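          The bitmap lookup of the third option could look roughly like the following
          sketch. It assumes 32 priority levels held in one machine word and uses the GCC
          builtin <function>__builtin_clz</function> as a stand-in for the
          processor-specific instruction, with a portable loop as a fallback. The type
          <function>prio_bitmap</function> and the three functions are hypothetical names.

          <programlisting><![CDATA[
```c
#include <assert.h>
#include <stdint.h>

#define PRIO_LEVELS 32u  /* assumed number of priority levels for this sketch */

/* Bit p is set while at least one message of priority p is pending. */
typedef struct { uint32_t bits; } prio_bitmap;

void bitmap_mark(prio_bitmap *bm, unsigned prio)
{
    assert(prio < PRIO_LEVELS);
    bm->bits |= (uint32_t)1 << prio;
}

void bitmap_clear(prio_bitmap *bm, unsigned prio)
{
    assert(prio < PRIO_LEVELS);
    bm->bits &= ~((uint32_t)1 << prio);
}

/* Highest pending priority, or -1 when nothing is pending. */
int bitmap_highest(const prio_bitmap *bm)
{
    if (bm->bits == 0)
        return -1;
#if defined(__GNUC__)
    /* Count leading zeros; assumes unsigned int is 32 bits wide. */
    return 31 - __builtin_clz(bm->bits);
#else
    /* Portable fallback: scan down from the most significant bit. */
    int p = 31;
    while (((bm->bits >> p) & 1u) == 0)
        p--;
    return p;
#endif
}
```
]]></programlisting>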

          At present, the second approach has been selected as the basic sorting
          mechanism. A future version of this component will probably offer all these
          solutions as configuration options.
        </para>
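        <para>
          The per-priority FIFO scheme (the second option in the list above) could be
          sketched as follows. This is a simplified illustration under stated assumptions,
          not the code of this component: messages are plain integers, the limits
          <function>NUM_PRIO</function> and <function>MAX_MSGS</function> are arbitrary,
          there is no locking, and all type and function names are hypothetical.
        </para>
        <programlisting><![CDATA[
```c
#include <assert.h>
#include <stdbool.h>

#define NUM_PRIO 32  /* assumed number of priority levels */
#define MAX_MSGS 16  /* assumed queue capacity */

struct fifo { int head, tail; };  /* slot indices, -1 when empty */

struct pq {
    int next[MAX_MSGS];           /* next slot in the same FIFO or free list */
    int payload[MAX_MSGS];        /* message contents (an int in this sketch) */
    int free_head;                /* first free slot, -1 when the queue is full */
    struct fifo fifo[NUM_PRIO];   /* one FIFO per priority */
    unsigned heap[NUM_PRIO];      /* max-heap of active priority tokens */
    int heap_len;
};

void pq_init(struct pq *q)
{
    for (int i = 0; i < MAX_MSGS; i++)
        q->next[i] = (i + 1 < MAX_MSGS) ? i + 1 : -1;
    q->free_head = 0;
    for (int p = 0; p < NUM_PRIO; p++)
        q->fifo[p].head = q->fifo[p].tail = -1;
    q->heap_len = 0;
}

static void heap_push(struct pq *q, unsigned prio)
{
    int i = q->heap_len++;
    while (i > 0 && q->heap[(i - 1) / 2] < prio) {  /* sift up */
        q->heap[i] = q->heap[(i - 1) / 2];
        i = (i - 1) / 2;
    }
    q->heap[i] = prio;
}

static void heap_pop(struct pq *q)  /* remove the top token */
{
    unsigned last = q->heap[--q->heap_len];
    int i = 0;
    for (;;) {                      /* sift the hole down from the root */
        int c = 2 * i + 1;
        if (c >= q->heap_len)
            break;
        if (c + 1 < q->heap_len && q->heap[c + 1] > q->heap[c])
            c++;
        if (q->heap[c] <= last)
            break;
        q->heap[i] = q->heap[c];
        i = c;
    }
    q->heap[i] = last;
}

bool pq_send(struct pq *q, unsigned prio, int value)
{
    assert(prio < NUM_PRIO);
    int s = q->free_head;
    if (s < 0)
        return false;               /* queue full */
    q->free_head = q->next[s];
    q->payload[s] = value;
    q->next[s] = -1;
    struct fifo *f = &q->fifo[prio];
    if (f->head < 0) {              /* FIFO was empty: insert its token */
        f->head = s;
        heap_push(q, prio);         /* logarithmic in active priorities */
    } else {
        q->next[f->tail] = s;       /* constant-time append */
    }
    f->tail = s;
    return true;
}

bool pq_receive(struct pq *q, int *value, unsigned *prio)
{
    if (q->heap_len == 0)
        return false;               /* queue empty */
    unsigned p = q->heap[0];        /* highest active priority */
    struct fifo *f = &q->fifo[p];
    int s = f->head;
    *value = q->payload[s];
    *prio = p;
    f->head = q->next[s];
    if (f->head < 0) {              /* FIFO drained: remove its token */
        f->tail = -1;
        heap_pop(q);
    }
    q->next[s] = q->free_head;      /* return the slot to the free list */
    q->free_head = s;
    return true;
}
```
]]></programlisting>
        <para>
          In this sketch the heap never holds more than one token per priority, so its
          size is bounded by the number of priority levels rather than by the number of
          pending messages.
        </para>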
      </section>

    </section>

    <section>
      <title>Tests</title>

      <para>
        Basic conformance tests have been carried out based on the Open POSIX Test Suite, a
        GPL open source project hosted on SourceForge. The current release of the suite only
        works at the functionality level; other kinds of tests, such as definition or stress
        tests, are not covered yet.
      </para>

      <para>
        Testing at the definition level means checking that the types defined by POSIX Message
        Queues are correctly defined in the include files provided by the
        implementation. The POSIX standard only requires two types to be defined:
        <function>struct mq_attr</function> and <function>mqd_t</function>. These types
        have not been tested explicitly, but they are tested implicitly when
        testing the implementation at the functionality level, since both types are used in
        the API functions.
      </para>

      <para>
        Other types required by POSIX, such as <function>struct sigevent</function> or
        <function>struct timespec</function>, depend on new components or other parts of
        RTLinux, and therefore their correct definition is outside the scope of this
        component. However, their definitions have been implicitly tested by the
        functionality-level tests of the corresponding functions
        <function>mq_notify</function>, <function>mq_timedreceive</function> and
        <function>mq_timedsend</function>.
      </para>

      <para>
        Stress tests check the behaviour of the system when resources are
        massively demanded, or when a large number of threads are
        sending and receiving messages simultaneously. The first case occurs when
        many message queues are created and activated at the same time. POSIX Message
        Queues define some limits that are fixed at compilation time, so this kind of
        stress test is well delimited. Stress tests with several threads using one or
        more message queues at the same time have been performed. Since Message
        Queues use RTLinux semaphores for synchronisation and (mainly) POSIX Signals for
        notification, the behaviour depends on these components.
      </para>

      <para>
        Functionality tests have been carried out for every API function, covering different
        behaviours: how messages are inserted into queues depending on their
        priority, and how these messages are delivered. The POSIX standard fixes the
        error values returned by the functions, so conformance tests have been carried out
        for the different cases in which these error values must be returned. Finally,
        although most of these tests can be run with a single thread, we have preferred to
        use distinct threads where possible to obtain more realistic behaviour, unlike the
        SourceForge test suite, which uses a single process to send and receive
        messages.
      </para>
    </section>

    <section>
      <title>Validation criteria</title>
      
      <para>
        This component provides a new message passing facility that was not present in the
        current version of the RTLinux executive. This new feature allows prioritised
        communication between different hard real-time tasks with bounded overheads. This
        facility was in high demand in the Real Time Linux community. An extension that
        allows the same kind of communication between Linux processes and RTLinux threads
        is already planned.
      </para>

      <para>
        The performance of POSIX Message Queues relies strongly on the performance of the
        synchronisation mechanisms and on the system memory bandwidth.  This implementation
        uses semaphores as the synchronisation mechanism, both to achieve mutual
        exclusion when accessing shared data and to block sending/receiving threads
        when a given queue is full/empty.
      </para>

      <para>
        A performance evaluation of POSIX Message Queues should study how data and
        synchronisation issues are managed by this design. A low-level synchronisation
        mechanism could be used to implement mutually exclusive access to shared data, but
        this option greatly complicates the design of the timed send and receive functions,
        unnecessarily increasing the overhead and reducing the implementation clarity of
        these services.
      </para>
      
      <para>
        On the other hand, the POSIX standard requires a two-copy implementation of the message
        send/receive process when copying data to/from message queues, which implies an
        extra overhead that is not required when lightweight threads are used. One of the best
        ways to improve Message Queues performance could be to extend the POSIX standard API
        with several functions that allow a zero-copy version of the send/receive process.
        However, the two-copy implementation simplifies the design of a future
        extension of Message Queues for communication between Linux processes and RTLinux
        threads. This issue will be studied in the second phase of component development.
      </para>

    </section>
  </chapter>
</book>
