This article provides a high-level overview of SRT, an open source technology for video streaming. Readers seeking more depth can consult the technical overview on the SRT GitHub repository: https://github.com/Haivision/srt/files/2489142/SRT_Protocol_TechnicalOverview_DRAFT_2018-10-17.pdf
Originally developed by Haivision (https://www.haivision.com/), SRT stands for Secure Reliable Transport. SRT is a transport layer protocol that helps optimize video streaming performance across unmanaged networks, such as the Internet and public or private clouds. SRT can stream any type of content, including H.264 and HEVC video. It is most helpful when delivering ingest sources from remote events and sites to editing facilities, cloud services, and disaster recovery applications.
The “S” in SRT stands for Secure: the protocol uses encryption such as AES-128 and AES-256 to protect communication between the sender and the receiver. One of the main advantages of SRT over other transport protocols is its low latency, achieved through configurable latency buffers and smart ACK/NAK messages. As a low-latency point-to-point streaming protocol with packet loss recovery, SRT is well suited to first-mile applications such as broadcast contribution, backhaul, bidirectional interviews, and return feeds.
Hundreds of technology vendors have adopted SRT, and it continues to evolve as the de facto industry standard for reliable video streaming in place of RTMP, which does not include encryption and does not support HEVC. Best of all, SRT is open source: it is continually developed with numerous community contributors and is publicly available on GitHub. Telestream is a member of the SRT Alliance (https://www.srtalliance.org/), together with over 350 vendors.
How SRT works:
SRT is a UDP-based protocol, which avoids much of the overhead of TCP-based transfer mechanisms, and it adds an ARQ (Automatic Repeat Request) mechanism to handle packet loss and retransmission. Unlike TCP, which acknowledges received data on an ongoing basis, SRT sends ACKs (acknowledgments) only periodically. This saves bandwidth and speeds up transmission. An ACK packet is sent at specified intervals to acknowledge the receipt of packets. It carries the sequence number that follows the last packet received in order. For example, if the receiver sends an ACK for packet 6, all packets up to and including packet 5 have been received and can be removed from the sender’s buffer.
If packet #2 arrives but packet #3 does not, a NAK (negative acknowledgment) for packet #3 is sent right after the arrival of packet #4.
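The ACK and NAK behaviour described above can be sketched with a toy receiver. This is only an illustration under simplifying assumptions: the class name `Receiver`, its methods, and the absence of sequence number wrap-around are all hypothetical choices for the example, not part of the SRT library.

```python
class Receiver:
    """Toy receiver: tracks in-order delivery and reports gaps (hypothetical sketch)."""

    def __init__(self, first_seq=1):
        self.next_expected = first_seq      # lowest sequence number not yet received
        self.highest_seen = first_seq - 1   # highest sequence number seen so far
        self.out_of_order = set()           # packets received ahead of a gap

    def on_data(self, seq):
        """Record a data packet and return the sequence numbers to NAK right now."""
        if seq < self.next_expected:
            return []                       # duplicate of an already delivered packet
        missing = []
        if seq > self.highest_seen + 1:
            # A gap just opened: everything between the old highest and seq is missing
            missing = list(range(self.highest_seen + 1, seq))
        self.highest_seen = max(self.highest_seen, seq)
        self.out_of_order.add(seq)
        # Advance the cumulative pointer across any contiguous run of packets
        while self.next_expected in self.out_of_order:
            self.out_of_order.discard(self.next_expected)
            self.next_expected += 1
        return missing

    def ack_number(self):
        """Value a periodic ACK would carry: the next packet the receiver expects."""
        return self.next_expected


rx = Receiver()
for seq in (1, 2):
    rx.on_data(seq)
nak = rx.on_data(4)        # packet 3 never arrived
print(nak)                 # [3]
print(rx.ack_number())     # 3
```

Once the retransmitted packet 3 arrives, `ack_number()` moves to 5, which tells the sender that packets 1 through 4 can leave its buffer.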
To avoid long waits for a NAK, added latency, and wasted bandwidth, SRT’s ARQ mechanism also sends periodic NAK reports. A report is sent every RTT/2 (half the round-trip time), which helps maintain low latency, although it means the same packet may occasionally be retransmitted twice.
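A minimal sketch of the periodic report timing follows. The `LossReporter` class and its method names are hypothetical, time is passed in explicitly as milliseconds so the example is deterministic, and a real implementation would track a changing RTT rather than a fixed one.

```python
class LossReporter:
    """Toy loss list that emits a report every RTT/2 (hypothetical sketch)."""

    def __init__(self, rtt_ms):
        self.interval_ms = rtt_ms / 2    # reporting period taken from the text: RTT/2
        self.missing = set()             # sequence numbers still outstanding
        self.next_report_at = None       # no report is scheduled while nothing is missing

    def mark_lost(self, seq, now_ms):
        self.missing.add(seq)
        if self.next_report_at is None:
            self.next_report_at = now_ms + self.interval_ms

    def mark_recovered(self, seq):
        self.missing.discard(seq)
        if not self.missing:
            self.next_report_at = None

    def poll(self, now_ms):
        """Return the sorted loss list if a periodic report is due, otherwise None."""
        if self.next_report_at is None or now_ms < self.next_report_at:
            return None
        self.next_report_at = now_ms + self.interval_ms
        return sorted(self.missing)


reporter = LossReporter(rtt_ms=40)       # reports every 20 ms
reporter.mark_lost(3, now_ms=0)
print(reporter.poll(now_ms=10))          # None
print(reporter.poll(now_ms=20))          # [3]
```

If the retransmission of packet 3 is still in flight when the next report fires, the report lists it again, which is how the occasional double retransmission mentioned above can happen.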
Round-trip time (RTT) is the time it takes a packet to travel to the other endpoint and back. SRT cannot measure one-way transmission time directly, so it uses RTT/2, which is calculated from the ACK exchange. An ACK from the receiver triggers the sender to transmit an ACKACK with almost no delay. The time between sending an ACK and receiving the corresponding ACKACK is the RTT. The ACKACK packet also tells the receiver to stop sending ACKs for a specific position. Otherwise, the receiver would keep sending them, assuming the sender never received them. This ACK/ACKACK exchange helps SRT detect and adapt to real-time network conditions between the two endpoints.
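The RTT measurement can be sketched as follows, from the receiver's point of view. The `RttEstimator` class is hypothetical, and the smoothing step (7/8 of the old value plus 1/8 of the new sample) is an assumption borrowed from common practice, since the text above does not say how samples are combined.

```python
class RttEstimator:
    """Toy RTT tracker driven by ACK / ACKACK pairs (hypothetical sketch)."""

    def __init__(self):
        self.pending = {}    # ACK number -> time (ms) at which that ACK was sent
        self.rtt_ms = None   # current estimate; None until the first sample

    def on_ack_sent(self, ack_no, now_ms):
        self.pending[ack_no] = now_ms

    def on_ackack(self, ack_no, now_ms):
        """Match an ACKACK to its ACK, update the estimate, and return it."""
        sent_at = self.pending.pop(ack_no, None)
        if sent_at is None:
            return self.rtt_ms              # unknown or duplicate ACKACK: ignore it
        sample = now_ms - sent_at
        if self.rtt_ms is None:
            self.rtt_ms = sample
        else:
            # Assumed smoothing: weight the history 7/8 and the new sample 1/8
            self.rtt_ms = 0.875 * self.rtt_ms + 0.125 * sample
        return self.rtt_ms

    def one_way_ms(self):
        """One-way delay estimate used in place of a direct measurement: RTT/2."""
        return None if self.rtt_ms is None else self.rtt_ms / 2


est = RttEstimator()
est.on_ack_sent(6, now_ms=1000)
print(est.on_ackack(6, now_ms=1040))   # 40
print(est.one_way_ms())                # 20.0
```

Removing the entry from `pending` when the ACKACK arrives mirrors the second role described above: the receiver no longer needs to repeat the ACK for that position.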
SRT has two kinds of packets: data and control. The first bit of the PH_SEQNO field in the packet header distinguishes them: 0 for data packets and 1 for control packets.
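As a rough illustration, the check on that first bit could look like this. The function name `classify` is hypothetical, and the sketch assumes the field is the first 32-bit word of the packet in network byte order, with the "first bit" being the most significant one.

```python
import struct

CONTROL_FLAG = 0x80000000   # most significant bit of the first 32-bit header word


def classify(packet: bytes):
    """Return ('control', None) or ('data', sequence_number) for a raw packet."""
    (word,) = struct.unpack_from("!I", packet, 0)   # "!I" = big-endian unsigned 32-bit
    if word & CONTROL_FLAG:
        return ("control", None)
    # For data packets the remaining 31 bits are treated as the sequence number
    return ("data", word & 0x7FFFFFFF)


data_packet = struct.pack("!I", 5) + bytes(12)
control_packet = struct.pack("!I", 0x80020000) + bytes(12)
print(classify(data_packet))      # ('data', 5)
print(classify(control_packet))   # ('control', None)
```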
Control packets are used for different types of messages between the sender and the receiver.
Type    Extended Type  Name                Description
0       0              HANDSHAKE           Handshake packets are used to establish a connection between two peers in a point-to-point session
1       0              KEEPALIVE           Keep-alive packets are exchanged approximately every 10 ms to enable streams to be automatically restored after a connection loss
2       0              ACK                 Acknowledgment packets are used to provide data packet delivery status and RTT information
3       0              NAK                 Loss Report – Negative acknowledgement packets are used to signal failed data packet deliveries
4       0              Congestion Warning
5       0              Shutdown            Shutdown packets initiate the closing of an SRT connection
6       0              ACKACK              ACKACK packets are used to acknowledge the reception of an ACK, and are instrumental in the ongoing calculation of RTT
7       0              Drop Request
8       0              Peer Error
0x7FFF – Message Extension
0x7FFF  1              SRT_HSREQ           SRT Handshake Request
0x7FFF  2              SRT_HSRSP           SRT Handshake Response
0x7FFF  3              SRT_KMREQ           Encryption Keying Material Request
0x7FFF  4              SRT_KMRSP           Encryption Keying Material Response
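The table above maps naturally onto a lookup keyed by type and extended type. The sketch below assumes that, in a control packet, the 15 bits after the control flag hold the type and the following 16 bits hold the extended type; that layout and the function name `control_name` are assumptions for illustration, so check them against the technical overview before relying on them.

```python
import struct

# (type, extended type) -> name, copied from the table above
CONTROL_TYPES = {
    (0, 0): "HANDSHAKE",
    (1, 0): "KEEPALIVE",
    (2, 0): "ACK",
    (3, 0): "NAK",
    (4, 0): "Congestion Warning",
    (5, 0): "Shutdown",
    (6, 0): "ACKACK",
    (7, 0): "Drop Request",
    (8, 0): "Peer Error",
    (0x7FFF, 1): "SRT_HSREQ",
    (0x7FFF, 2): "SRT_HSRSP",
    (0x7FFF, 3): "SRT_KMREQ",
    (0x7FFF, 4): "SRT_KMRSP",
}


def control_name(packet: bytes) -> str:
    """Name the control message carried by a raw packet (assumed bit layout)."""
    (word,) = struct.unpack_from("!I", packet, 0)
    if not word & 0x80000000:
        raise ValueError("not a control packet")
    ctrl_type = (word >> 16) & 0x7FFF     # assumed: 15-bit type after the flag bit
    ext_type = word & 0xFFFF              # assumed: 16-bit extended type
    # The extended type only matters for the 0x7FFF message extension
    key = (ctrl_type, ext_type if ctrl_type == 0x7FFF else 0)
    return CONTROL_TYPES.get(key, "UNKNOWN")


print(control_name(struct.pack("!I", 0x80020000)))   # ACK
print(control_name(struct.pack("!I", 0xFFFF0003)))   # SRT_KMREQ
```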
SRT supports two connection modes:
Caller-Listener: where one side waits for the other to initiate a connection.
Rendezvous: where both sides attempt to initiate a connection. In this mode, a source port can be specified to help with NAT and firewall traversal.
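The idea behind rendezvous mode can be illustrated with two plain UDP sockets on the local machine. This is not SRT and uses no SRT library: it only shows both peers sending first from a bound local port, which is the property that makes a predictable source port useful behind NAT. The helper `open_peer` is hypothetical, and the OS picks the ports here, whereas a real deployment would configure fixed ones.

```python
import socket


def open_peer():
    """Open a UDP socket bound to a local port (hypothetical helper)."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind(("127.0.0.1", 0))   # port 0: let the OS choose a free port
    sock.settimeout(2.0)          # do not block forever if a datagram is lost
    return sock


peer_a, peer_b = open_peer(), open_peer()
addr_a, addr_b = peer_a.getsockname(), peer_b.getsockname()

# Neither side waits for the other: both initiate
peer_a.sendto(b"hello from A", addr_b)
peer_b.sendto(b"hello from B", addr_a)

msg_at_b, src_seen_by_b = peer_b.recvfrom(1500)
msg_at_a, src_seen_by_a = peer_a.recvfrom(1500)
peer_a.close()
peer_b.close()

# Each side sees traffic arriving from exactly the address it is sending to
print(src_seen_by_b == addr_a, src_seen_by_a == addr_b)
```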
To initiate a caller-listener session, a four-packet handshake exchange is required:
The Caller starts by sending an “induction” message.
The Listener responds to this first packet with the same information, but this time includes a generated cookie value based on host, port, and current time. This “SYN cookie” value mitigates a potential DoS (Denial of Service) attack caused by flooding the Listener with handshake commands.
Next, the Caller responds with a URQ_CONCLUSION message using the same cookie value from the previous packet.
The Listener responds with the same values, but without a cookie this time.
At this point, the connection is considered established, and data flow starts.
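The cookie step of the handshake can be sketched as below. Everything here is an illustrative assumption rather than SRT's actual construction: the `Listener` class, the message dictionaries, the SHA-256 hash, the one-minute time bucket, and the local secret are all hypothetical. The point it shows is that the listener keeps no per-caller state until a valid cookie comes back.

```python
import hashlib


def make_cookie(host, port, now_s, secret=b"listener-secret"):
    """Derive a cookie from host, port, and current time (assumed construction)."""
    minute = int(now_s // 60)                       # coarse time bucket
    material = secret + f"{host}:{port}:{minute}".encode()
    return int.from_bytes(hashlib.sha256(material).digest()[:4], "big")


class Listener:
    """Toy listener side of the four-packet exchange (hypothetical sketch)."""

    def __init__(self):
        self.connected = set()

    def on_induction(self, host, port, now_s):
        # Packet 2: echo the request back with a cookie, storing nothing
        return {"type": "induction", "cookie": make_cookie(host, port, now_s)}

    def on_conclusion(self, host, port, cookie, now_s):
        # Packet 4: accept only if the caller returned the cookie we would issue
        if cookie != make_cookie(host, port, now_s):
            return None
        self.connected.add((host, port))
        return {"type": "URQ_CONCLUSION", "cookie": None}


listener = Listener()
caller = ("203.0.113.5", 9000)
reply = listener.on_induction(*caller, now_s=1000)                     # packets 1 and 2
final = listener.on_conclusion(*caller, reply["cookie"], now_s=1010)   # packets 3 and 4
print(final["type"], caller in listener.connected)
```

A caller that floods the listener with induction messages but never returns a valid cookie causes no entry in `connected`, which is the mitigation described above. A fuller sketch would also accept a cookie from the previous time bucket so that a handshake spanning a minute boundary still succeeds.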
A configurable passphrase protects the SRT stream on the sender side. On the receiver, a configured passphrase is only used if the incoming media stream is encrypted. A receiver knows that a received stream is encrypted, and how (cipher and key length), from the received stream itself, not because it has a passphrase configured. The keying material is transmitted within the connection handshake packets, and for a short period when rekeying occurs.
Operators can decide which key size they want to use: 128, 192, or 256 bits. This can be configured as fast, medium, or strong encryption.
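To make the key sizes concrete, the sketch below maps the three labels to key lengths and stretches a passphrase into a key of that length. The label-to-size mapping assumes the two lists above are in matching order, and the derivation itself (PBKDF2 with HMAC-SHA1, 2048 iterations, a 16-byte salt) is an illustrative assumption; the text above does not describe how SRT turns a passphrase into key material.

```python
import hashlib

# Assumed mapping, following the order in which the text lists sizes and labels
KEY_BITS = {"fast": 128, "medium": 192, "strong": 256}


def derive_key(passphrase: str, salt: bytes, strength: str = "fast",
               iterations: int = 2048) -> bytes:
    """Stretch a passphrase into a key of the selected length (illustrative only)."""
    bits = KEY_BITS[strength]                      # raises KeyError on an unknown label
    return hashlib.pbkdf2_hmac(
        "sha1", passphrase.encode("utf-8"), salt, iterations, dklen=bits // 8
    )


salt = bytes(16)   # fixed salt so the example is repeatable; real salts are random
for label in ("fast", "medium", "strong"):
    key = derive_key("correct horse battery", salt, label)
    print(label, len(key) * 8)   # fast 128, medium 192, strong 256
```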
SRT encrypts the media stream at the Transmission Payload level (the UDP payload of MPEG-TS/UDP encapsulation, which is typically 7 MPEG-TS packets of 188 bytes each, or 1,316 bytes).
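The payload size follows directly from the arithmetic: 7 × 188 = 1,316 bytes. A small sketch of grouping a transport stream into payloads of that size, with the hypothetical helper `split_into_payloads`, looks like this:

```python
TS_PACKET_SIZE = 188          # bytes per MPEG-TS packet
TS_PACKETS_PER_PAYLOAD = 7    # MPEG-TS packets grouped into one payload
PAYLOAD_SIZE = TS_PACKET_SIZE * TS_PACKETS_PER_PAYLOAD   # 7 * 188 = 1316 bytes


def split_into_payloads(ts_stream: bytes):
    """Group whole MPEG-TS packets into payloads of up to 7 packets each."""
    if len(ts_stream) % TS_PACKET_SIZE:
        raise ValueError("stream length is not a multiple of 188 bytes")
    return [
        ts_stream[offset:offset + PAYLOAD_SIZE]
        for offset in range(0, len(ts_stream), PAYLOAD_SIZE)
    ]


# 16 dummy TS packets, each starting with the 0x47 sync byte
stream = (b"\x47" + bytes(187)) * 16
payloads = split_into_payloads(stream)
print([len(p) for p in payloads])   # [1316, 1316, 376]
```

Each of these payloads is the unit that would be encrypted and carried in one packet, so encryption never splits an individual MPEG-TS packet across two payloads.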
Telestream has partnered with Haivision and Microsoft around the SRT Hub, a cloud service for intelligent media routing across Microsoft Azure that addresses global contribution and distribution workflows. SRT Hub can ingest video at Azure data-centre regions worldwide and provide low-latency transport of the video stream across the Azure global backbone for use in other regions. Ingest to and egress from SRT Hub are based on “Hublets”, including third-party Hublets that support numerous types of workflows and applications. One of the key use cases for SRT Hub is to replace satellite backhaul between global locations (including one-to-many locations). The Telestream IQ team is working with Haivision and Microsoft to leverage the IQ solutions for operational monitoring across the global SRT Hub framework. More on SRT Hub can be seen here: https://www.haivision.com/products/srthub/
Written by Michael Demb | Director, Solution Architecture, Strategic Sales at Telestream.