Voice and video conferencing solutions are very popular. The Session Initiation Protocol (SIP) is the key to vendor-independent and future-proof solutions. This blog post explains the concepts of SIP-based conference control and media exchange.
Introduction to SIP
The Session Initiation Protocol (SIP) is a signaling protocol for Voice over IP systems. It is specified in RFC 3261 by the Internet Engineering Task Force (IETF) and belongs to the TCP/IP protocol family of the Internet.
SIP enables users to establish, maintain, modify and release multimedia sessions.
Sessions may carry information in one direction only (simplex), as in paging systems, sound systems and video surveillance systems, or in both directions simultaneously (full duplex), as in phone calls with audio only or with audio and video.
SIP supports not only point-to-point connections but also multipoint connections (“conferences”). This article describes how such conferences can be implemented using SIP.
Tasks of SIP
SIP initiates a session or a conference.
It allows a terminal to join a conference or to leave it.
Terminals use SIP to invite other participants to a conference or to release participants from it.
Terminals use SIP to control the type of media stream, the coding techniques and similar parameters. SIP also transmits the status of the conference to the conference participants.
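To make these tasks more concrete, here is a minimal sketch of what a terminal might send to join a conference and to leave it again. It only assembles the request text in the header layout of RFC 3261 and does not send anything. All URIs, tags, the Via host and the IP address are hypothetical, and the session description offers just two G.711 audio codecs as an example of how the media type and coding are announced.

```python
import uuid

VIA_HOST = "client.example.com"  # hypothetical host name of the sending terminal


def build_sdp(user, ip, audio_port):
    """Describe one audio stream that offers the codecs PCMU and PCMA."""
    lines = [
        "v=0",
        f"o={user} 0 0 IN IP4 {ip}",
        "s=-",
        f"c=IN IP4 {ip}",
        "t=0 0",
        f"m=audio {audio_port} RTP/AVP 0 8",  # media type and offered payload types
        "a=rtpmap:0 PCMU/8000",
        "a=rtpmap:8 PCMA/8000",
    ]
    return "\r\n".join(lines) + "\r\n"


def build_request(method, target_uri, from_uri, call_id, cseq,
                  from_tag, to_tag=None, body=""):
    """Assemble a simplified SIP request as text."""
    to_header = f"To: <{target_uri}>" + (f";tag={to_tag}" if to_tag else "")
    headers = [
        f"{method} {target_uri} SIP/2.0",
        # The branch parameter has to start with the magic cookie z9hG4bK.
        f"Via: SIP/2.0/UDP {VIA_HOST};branch=z9hG4bK{uuid.uuid4().hex[:16]}",
        "Max-Forwards: 70",
        f"From: <{from_uri}>;tag={from_tag}",
        to_header,
        f"Call-ID: {call_id}",
        f"CSeq: {cseq} {method}",
        f"Contact: <{from_uri}>",
    ]
    if body:
        headers.append("Content-Type: application/sdp")
    headers.append(f"Content-Length: {len(body.encode('utf-8'))}")
    return "\r\n".join(headers) + "\r\n\r\n" + body


if __name__ == "__main__":
    conference = "sip:conf1@example.com"   # hypothetical conference URI
    alice = "sip:alice@example.com"        # hypothetical participant
    call_id = "abc123@client.example.com"

    # Join: INVITE carrying the media offer.
    sdp = build_sdp("alice", "192.0.2.10", 49170)
    print(build_request("INVITE", conference, alice, call_id, 1, "1928", body=sdp))

    # Leave: BYE within the same dialog (same Call-ID and tags, higher CSeq).
    # Simplification: a real BYE would target the Contact URI learned
    # from the response to the INVITE.
    print(build_request("BYE", conference, alice, call_id, 2, "1928", to_tag="a6c85cf"))
```

The two requests share the Call-ID and the From tag, which is what ties them to the same SIP dialog.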
Conflicting Tasks of SIP
On the one hand, SIP is used for central signaling; on the other hand, it offers the advantage of a decentralized concept. This means that different parties can manipulate sessions via SIP.
The media stream in the IP network can be selectively transported via unicast sessions or distributed over a multicast-enabled infrastructure.
An important task is mixing the signals from different sources so that all conference participants can “listen” simultaneously (given full-duplex transmission). The media is mixed either centrally (in a dedicated “box”) or locally in the individual devices.
A conference can start ad hoc or be scheduled in advance. Each case requires different functions (for example, calendar integration).
Participants should be able to dial into the conference from outside (dial-in) or be invited as external participants from within the conference (dial-out). Peer-to-peer conferencing is also conceivable.
Inviting conferees has to be simple, and so does releasing them from the conference.
At the end of the conference, all parties must be clearly informed that the conference has ended.
Centralized Conference Control
A central conference bridge (Multipoint Control Unit, MCU) controls the SIP dialogs with all relevant parties. The SIP devices (SIP User Agents) “see” only point-to-point SIP sessions.
According to RFC 4353, this constellation is called a “tightly coupled conference” because a central entity has full control over the conference.
Another form of centralized conference control is carried out directly in a terminal. In this case, the terminal itself acts as the conference center. RFC 4353 calls the central signaling point of a conference the “Focus”; here the terminal takes on that role.
Decentralized Conference Control
In a mesh network, all stations are peers and communicate directly with each other via SIP. In RFC 4353, this situation is called a “fully distributed multiparty conference” because each participant controls its own connections to all other participants via SIP signaling.
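The difference between the two control models shows up in the number of SIP dialogs that have to be maintained. The following sketch assumes a stand-alone conference bridge in the centralized case (one dialog per participant) and one dialog per pair of participants in the mesh; the function names are made up for illustration.

```python
def central_dialogs(participants):
    """Dialogs when every participant talks to one stand-alone bridge."""
    return participants


def mesh_dialogs(participants):
    """Dialogs when every participant talks to every other participant."""
    return participants * (participants - 1) // 2


if __name__ == "__main__":
    print("participants  central  mesh")
    for n in (3, 5, 10, 20):
        print(f"{n:12d}  {central_dialogs(n):7d}  {mesh_dialogs(n):4d}")
```

Under these assumptions the centralized model grows linearly with the number of participants, while the mesh grows quadratically, which is one reason why fully distributed conferences are usually kept small.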
Centralized Exchange of the Media Stream
Each conference participant sends its media stream to the central conference bridge and receives from it the mixed signal of the other participants.
If the conference controller is a terminal, that terminal is also responsible for forwarding the media streams to all conference participants.
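Returning to each participant “the mixed signal of the other participants” is often described as a mix-minus. A minimal sketch follows, assuming that the audio has already been decoded to 16-bit linear PCM and that all frames are time-aligned and of equal length. The function names are hypothetical and not taken from any particular bridge product.

```python
INT16_MIN, INT16_MAX = -32768, 32767


def clip(sample):
    """Limit a summed sample to the 16-bit PCM range."""
    return max(INT16_MIN, min(INT16_MAX, sample))


def mix_minus(frames):
    """Return, per participant, the sum of all OTHER participants' samples.

    frames maps a participant name to a list of PCM samples.
    """
    # Sum of everybody, sample by sample.
    total = [sum(column) for column in zip(*frames.values())]
    mixed = {}
    for name, samples in frames.items():
        # Subtract the participant's own signal so they do not hear themselves.
        mixed[name] = [clip(t - s) for t, s in zip(total, samples)]
    return mixed


if __name__ == "__main__":
    frames = {
        "alice": [100, 200],
        "bob": [10, 20],
        "carol": [1, 2],
    }
    for name, samples in mix_minus(frames).items():
        print(name, samples)
```

Computing the total once and subtracting each participant's own signal avoids building a separate sum for every listener.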
Decentralized Exchange of the Media Stream
In a mesh, all stations are peers and exchange their media streams directly with every other conference participant (via point-to-point connections).
Another variant is to transmit the media stream via a multicast-enabled network infrastructure. Here, rendezvous points are established in the IP network; they manage the membership of a multicast group, that is, participants joining and leaving. These points also distribute the signals from each subscriber to the other subscribers.
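From the point of view of a terminal, joining and leaving a multicast group is a socket option; the operating system then signals the membership change to the network. The sketch below uses Python's standard socket module for IPv4. The group address 239.1.2.3 and the port are hypothetical, and the helper names are made up for illustration.

```python
import socket
import struct


def membership_request(group, interface="0.0.0.0"):
    """Pack the IPv4 membership structure: group address plus local interface."""
    return struct.pack("4s4s", socket.inet_aton(group), socket.inet_aton(interface))


def join_group(sock, group, interface="0.0.0.0"):
    """Ask the OS to join the multicast group on the given interface."""
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_ADD_MEMBERSHIP,
                    membership_request(group, interface))


def leave_group(sock, group, interface="0.0.0.0"):
    """Ask the OS to leave the multicast group again."""
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_DROP_MEMBERSHIP,
                    membership_request(group, interface))


if __name__ == "__main__":
    GROUP = "239.1.2.3"   # hypothetical group from the administratively scoped range
    PORT = 5004           # hypothetical media port

    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind(("", PORT))
    join_group(sock, GROUP)    # start receiving the conference media
    # ... receive packets with sock.recvfrom(...) here ...
    leave_group(sock, GROUP)   # stop receiving
    sock.close()
```

Whether the join succeeds in practice depends on the local network configuration, so the example should be read as an illustration of the mechanism rather than a complete receiver.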
RFC 4353 SIP Conferencing Framework
RFC 4353 describes “A Framework for Conferencing with the Session Initiation Protocol (SIP)”. The central point of the conference is called the “Focus”. The Focus maintains a separate SIP dialog with each “participant” and contains all the functions for a conference. Users who subscribe to the “Conference Notification Service” receive the corresponding status messages. A Conference Policy Server manages user rights.
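How these roles interact can be sketched with a small in-memory model. This is not an implementation of RFC 4353: the class, its methods and the event names are hypothetical, the policy server is reduced to a set of permitted URIs, and notifications are simply collected in per-subscriber lists instead of being sent as SIP messages.

```python
class Focus:
    """Toy model of a conference focus with policy and notification."""

    def __init__(self, conference_uri, allowed_uris):
        self.conference_uri = conference_uri
        self.allowed = set(allowed_uris)   # stand-in for the policy server
        self.participants = set()          # one dialog per entry in a real focus
        self.subscribers = {}              # subscriber URI -> received events

    def subscribe(self, uri):
        """Register a user of the notification service."""
        self.subscribers[uri] = []

    def _notify(self, event, uri):
        for inbox in self.subscribers.values():
            inbox.append((event, uri))

    def join(self, uri):
        """Admit a participant if the policy permits it."""
        if uri not in self.allowed:
            return False
        self.participants.add(uri)
        self._notify("joined", uri)
        return True

    def leave(self, uri):
        if uri in self.participants:
            self.participants.discard(uri)
            self._notify("left", uri)

    def end(self):
        """Release everyone and announce that the conference is over."""
        for uri in sorted(self.participants):
            self.leave(uri)
        self._notify("ended", self.conference_uri)


if __name__ == "__main__":
    focus = Focus("sip:conf1@example.com",
                  {"sip:alice@example.com", "sip:bob@example.com"})
    focus.subscribe("sip:alice@example.com")
    focus.join("sip:alice@example.com")
    focus.join("sip:bob@example.com")
    focus.join("sip:mallory@example.com")   # rejected by the policy
    focus.end()
    for event in focus.subscribers["sip:alice@example.com"]:
        print(event)
```

The final "ended" event corresponds to the requirement that all parties are clearly informed when the conference is over.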
Availability of Solutions
On the world market, a number of manufacturers offer SIP-enabled conferencing solutions for different application areas and sizes. One must distinguish between solutions for providers, large corporate clients, small and medium enterprises and private users.
Every major Voice over IP and unified communications solution provider has offered conference systems in various forms for a long time. Whether these systems communicate via SIP or use proprietary protocols must be clarified case by case.
Cloud-based solutions have been booming for several years.
Books / Courseware:
Seminar „SIP Protocol – Details“
About the author
Ronald Schlager is an independent trainer, consultant, book author and blogger on communications technologies and their applications.