What is RTP - Real-time Transport Protocol?

RTP, short for Real-time Transport Protocol, defines a standard packet format for delivering audio and video over IP networks. It was developed by the Audio/Video Transport Working Group of the Internet Engineering Task Force (IETF) and first published in 1996 as RFC 1889, which was superseded in 2003 by RFC 3550. RTP is used extensively in communication and entertainment systems that stream real-time media, such as telephony, video conferencing applications, Internet Protocol television (IPTV) services, and web-based push-to-talk features. It supports both unicast and multicast delivery.
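To make the "standard packet format" concrete, the following minimal Python sketch decodes the 12-byte fixed header laid out in RFC 3550. The names RtpHeader and parse_rtp_header, and the sample packet values, are illustrative assumptions rather than part of any particular library; optional CSRC entries and header extensions are not handled.

```python
import struct
from typing import NamedTuple


class RtpHeader(NamedTuple):
    version: int
    padding: bool
    extension: bool
    csrc_count: int
    marker: bool
    payload_type: int
    sequence_number: int
    timestamp: int
    ssrc: int


def parse_rtp_header(packet: bytes) -> RtpHeader:
    """Decode the 12-byte fixed RTP header (RFC 3550 layout)."""
    if len(packet) < 12:
        raise ValueError("RTP packet shorter than the 12-byte fixed header")
    # Network byte order: two flag bytes, a 16-bit sequence number,
    # a 32-bit timestamp, and a 32-bit synchronization source (SSRC) id.
    b0, b1, seq, ts, ssrc = struct.unpack("!BBHII", packet[:12])
    return RtpHeader(
        version=b0 >> 6,
        padding=bool(b0 & 0x20),
        extension=bool(b0 & 0x10),
        csrc_count=b0 & 0x0F,
        marker=bool(b1 & 0x80),
        payload_type=b1 & 0x7F,
        sequence_number=seq,
        timestamp=ts,
        ssrc=ssrc,
    )


# Hypothetical sample packet: version 2, payload type 0, sequence number 1,
# timestamp 160, followed by 160 bytes of dummy payload.
sample = struct.pack("!BBHII", 0x80, 0x00, 1, 160, 0x12345678) + bytes(160)
header = parse_rtp_header(sample)
print(header.version, header.sequence_number, header.timestamp)
```

The sequence number and timestamp fields decoded here are what receivers rely on to detect loss and to schedule playback.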

RTP Diagram

RTP is used in conjunction with the RTP Control Protocol (RTCP) and, commonly, a signaling protocol such as the Session Initiation Protocol (SIP). While RTP carries the media streams (e.g., audio and video), RTCP packets report transmission statistics and quality of service (QoS) data such as jitter, packet loss (detected from gaps in the RTP sequence numbers), and round-trip time. RTCP also aids in the synchronization of multiple streams. By convention, an RTP session uses an even port number and the associated RTCP traffic uses the next higher odd port number. No single port is mandated for RTP; in practice, ports are taken from the range 1024 to 65535. RTP is one of the foundations of VoIP, where SIP sets up the connections across the network and RTP carries the media.
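The even/odd port pairing can be sketched in a few lines of Python. The function name rtcp_port_for and the example port are assumptions chosen for illustration; real deployments may negotiate ports differently through signaling.

```python
def rtcp_port_for(rtp_port: int) -> int:
    """Return the RTCP port conventionally paired with an RTP port."""
    # Upper bound is 65534 so that the paired odd port still fits in 16 bits.
    if not 1024 <= rtp_port <= 65534:
        raise ValueError("RTP port outside the range 1024-65534")
    if rtp_port % 2 != 0:
        raise ValueError("by convention the RTP port is even")
    return rtp_port + 1


# Hypothetical example port.
print(rtcp_port_for(5004))
```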

What are the advantages and usage of RTP?

As its name implies, RTP is designed for end-to-end, real-time streaming of media across IP networks, including the internet. Because packets on such networks can take different paths and experience varying delays, RTP packets arrive with uneven spacing, a variation known as ‘jitter’. RTP carries the information receivers need to compensate for jitter, detect packet loss, and restore the order of packets that arrive out of sequence: every packet has a sequence number and a timestamp. RTP prioritizes the quick delivery of packets over making sure that all data packets are received.
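As a rough illustration of how a receiver can use sequence numbers and timestamps, the Python sketch below counts lost and reordered packets and keeps a running jitter estimate using the smoothing formula from RFC 3550 (J += (|D| - J) / 16). The class name ReceiverStats and the sample packets are hypothetical. The sketch assumes arrival times are already expressed in RTP timestamp units, ignores 16-bit sequence number wraparound, and counts duplicates as reordered packets.

```python
class ReceiverStats:
    """Track loss, reordering, and interarrival jitter for one RTP stream."""

    def __init__(self) -> None:
        self.first_seq = None
        self.highest_seq = None
        self.received = 0
        self.reordered = 0
        self.jitter = 0.0
        self._prev_transit = None

    def on_packet(self, seq: int, rtp_timestamp: int, arrival: int) -> None:
        """Record one packet; arrival is given in RTP timestamp units."""
        self.received += 1
        if self.first_seq is None:
            self.first_seq = self.highest_seq = seq
        elif seq > self.highest_seq:
            self.highest_seq = seq
        else:
            self.reordered += 1  # arrived after a later-numbered packet

        # Relative transit time; its change between packets drives jitter.
        transit = arrival - rtp_timestamp
        if self._prev_transit is not None:
            d = abs(transit - self._prev_transit)
            self.jitter += (d - self.jitter) / 16.0
        self._prev_transit = transit

    @property
    def lost(self) -> int:
        """Packets expected from the sequence range but never received."""
        if self.first_seq is None:
            return 0
        expected = self.highest_seq - self.first_seq + 1
        return expected - self.received


stats = ReceiverStats()
# Hypothetical (seq, rtp_timestamp, arrival) tuples: packet 3 arrives late
# and after packet 4, and packet 5 never arrives.
for seq, ts, arrival in [(1, 0, 0), (2, 160, 170), (4, 480, 490),
                         (3, 320, 500), (6, 800, 810)]:
    stats.on_packet(seq, ts, arrival)

print(stats.lost, stats.reordered, stats.jitter > 0)
```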

An example of this is when someone watches a live video stream online. The stream uses RTP to send the video data to the user's device. If some of the data packets are lost or delayed, RTP itself does not wait for them or retransmit them; the player skips or conceals the missing data, which costs a few frames or a fraction of a second of video. The effect can be so small that the user does not even notice it.

Because RTP can deliver data to multiple destination endpoints in parallel via IP multicast, it is the primary standard employed for audio and video transport over IP networks. The profiles and payload formats that accompany the RTP specification are implemented in the application itself, rather than in the operating system's protocol stack.

RTP use in VoIP applications

Applications such as VoIP that stream multimedia data in real time typically require timely delivery and can tolerate varying degrees of packet loss. For example, a lost audio packet in a VoIP call removes only a few milliseconds of audio. Error concealment algorithms can mask this loss so that it is imperceptible to the callers. Carrying RTP over TCP (Transmission Control Protocol) is also standardized, but it is uncommon in practice because TCP's error-control mechanisms can cause delays that affect timely packet delivery. For this reason, most RTP applications are built on UDP (User Datagram Protocol).
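One very simple form of error concealment is to repeat the last good audio frame whenever a packet is missing. The Python sketch below illustrates that idea only; it is an assumption chosen for brevity, not the method used by any particular codec or VoIP product, and the function name conceal_losses and the sample frames are hypothetical.

```python
from typing import Dict, List


def conceal_losses(frames: Dict[int, bytes], first_seq: int,
                   last_seq: int, frame_size: int) -> List[bytes]:
    """Return one frame per sequence number, filling gaps by repetition."""
    previous = bytes(frame_size)  # start from silence (all-zero samples)
    output = []
    for seq in range(first_seq, last_seq + 1):
        frame = frames.get(seq)
        if frame is None:
            frame = previous  # packet lost: repeat the last good frame
        output.append(frame)
        previous = frame
    return output


# Hypothetical received frames keyed by sequence number; packet 3 was lost.
received = {1: b"aaaa", 2: b"bbbb", 4: b"dddd"}
playout = conceal_losses(received, first_seq=1, last_seq=4, frame_size=4)
print(playout)
```

Here the gap left by packet 3 is filled with a copy of frame 2, so playback continues without waiting for a retransmission.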

Further reading