Online Video with Wowza and Amazon Elastic Transcoder
Performance and data interoperability are critical to the success of any web application. For web apps that need to support video processing – which is inherently compute- and I/O-intensive – these challenges are particularly acute. In this post, I describe some of my experience successfully incorporating video capabilities into a PHP-based web app, leveraging open source technologies and cloud-based services to the greatest extent possible.
The success and adoption of any web app today is highly dependent on its performance, flexibility, and ease of use.
Users quickly lose patience with an app whose pages take too long to load. For web apps that need to support video processing – which is inherently compute- and I/O-intensive – this challenge is particularly acute. At the same time, users are becoming increasingly demanding, expecting their videos to be high quality and to load quickly, even on a smartphone or tablet.
Users are also losing tolerance for web apps that don’t work on their preferred browser or device, or that don’t support the data format they need to load or export. The diversity of video formats that need to be supported therefore also makes incorporating video support into a web app especially challenging.
This post describes how I effectively leveraged open source technologies and cloud-based services to incorporate video capabilities into a PHP-based web app.
Use Case
I was part of a team that needed to develop a YouTube-like website, where registered users could upload and share their videos.
The system needed to allow registered users to upload their videos in a variety of supported formats which would then be converted to a common format (MP4). We also needed to generate a set of thumbnails and an image collage to be used in the video player for showing the frames on a video progress bar.
Things were further complicated by the fact that client requirements prevented us from using any available CDN or transcoding APIs, so we needed to develop our solution from scratch.
Video Upload
Since the upload process itself did not need to be video-specific (we just needed an easy-to-use file upload capability), it made sense to use an existing open source solution rather than rolling our own. We selected jQuery-File-Upload, primarily because it supported two features that were essential in our case: an upload progress bar and chunked uploads.
Chunked uploading allowed users to upload a video file of virtually any size (especially important to support video files in HD resolution). With this approach, the file is divided into multiple “chunks” on the front end, which invokes the upload action once per chunk (along with metadata for each chunk, such as the chunk number and total file size). The complete video file is then reassembled on the back end. Incidentally, including the chunk number in the metadata proved particularly important, since some browsers (such as Mobile Safari) have a tendency to transmit the chunks in random order.
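To illustrate, here is a minimal sketch of how chunks might be reassembled on the back end once all of them have arrived. The function name and the `<uploadId>.part<N>` naming scheme are hypothetical, not our exact code, and a real implementation also needs locking and error handling:

```php
<?php
// Reassemble uploaded chunks into the original file.
// Assumes each chunk was saved as "<uploadId>.part<N>" using the
// chunk number sent by the front end (an illustrative naming scheme).
function reassembleChunks(string $dir, string $uploadId, int $totalChunks, string $target): void
{
    $out = fopen($target, 'wb');
    // Iterate by chunk number rather than directory order, since some
    // browsers (e.g., Mobile Safari) upload chunks out of order.
    for ($i = 0; $i < $totalChunks; $i++) {
        $part = sprintf('%s/%s.part%d', $dir, $uploadId, $i);
        fwrite($out, file_get_contents($part));
        unlink($part); // remove the chunk once it has been appended
    }
    fclose($out);
}
```

This is why the chunk number in the metadata matters: the loop is driven by the index the front end sent, not by the order in which chunks happened to arrive.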
Online Video Processing
Video processing can be as simple as capturing frames as still images, or can involve more complex operations such as image enhancement, stabilizing the video stream, and so on. In our case, the only video processing requirements were to (a) extract video codecs and other key metadata and (b) generate a set of thumbnails and an image collage (to be used in the video player for showing the frames on a video progress bar).
FFmpeg – a widely-used, freely-distributed, open source library – was extremely helpful in meeting these requirements. FFmpeg provides a complete, cross-platform solution for recording, converting, and streaming audio and video files. It can also be used to convert videos and do simple editing (e.g., trimming, cutting, adding a watermark, etc.).
For our purposes, we were able to use FFmpeg to split the video into ten sections, and then capture a thumbnail for each section to provide the needed functionality.
Unfortunately, though, there are no PHP language bindings for the FFmpeg library. As a result, the only way to leverage FFmpeg from PHP is to invoke the binary from the command line using system commands. There are basically two ways to use FFmpeg in PHP:
- libav. Libav is a free software project, forked from FFmpeg in 2011, that produces libraries and programs for handling multimedia data. On Ubuntu, for example, it can be installed with the command `sudo apt-get install libav-tools`. Its avconv tool is largely command-line compatible with ffmpeg, and PHP needs command line access to `ffmpeg`/`avconv` to use either programmatically.
- PHP-FFMpeg. PHP-FFMpeg is an object-oriented PHP driver for the FFmpeg binary. It can be installed via Composer with `composer require php-ffmpeg/php-ffmpeg`.
We used PHP-FFMpeg since it provides easy access to the FFmpeg functionality we were interested in. For example, the `FFProbe` class from this package lets you retrieve information such as the codecs or duration of a particular video file:
$ffprobe = FFMpeg\FFProbe::create();
$ffprobe
->format('/path/to/video.mp4') // extracts file information
->get('duration');
PHP-FFMpeg also makes it easy to save any video frame:
$ffmpeg = FFMpeg\FFMpeg::create();
$video = $ffmpeg->open('video.mpg');
$video
->frame(FFMpeg\Coordinate\TimeCode::fromSeconds(10))
->save('frame.jpg');
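Putting the two snippets together, here is a hedged sketch of how the ten per-section thumbnails can be produced: probe the duration, split it into ten equal sections, and capture a frame at the midpoint of each. The timestamp helper and the file naming are illustrative, not our exact production code:

```php
<?php
// Compute one capture timestamp (the section midpoint) for each of N sections.
function sectionTimestamps(float $duration, int $sections = 10): array
{
    $len = $duration / $sections;
    $times = [];
    for ($i = 0; $i < $sections; $i++) {
        $times[] = $i * $len + $len / 2; // midpoint of section $i
    }
    return $times;
}

// Usage with PHP-FFMpeg (requires the ffmpeg/ffprobe binaries to be installed):
// $duration = FFMpeg\FFProbe::create()->format('video.mp4')->get('duration');
// $video = FFMpeg\FFMpeg::create()->open('video.mp4');
// foreach (sectionTimestamps((float) $duration) as $i => $t) {
//     $video->frame(FFMpeg\Coordinate\TimeCode::fromSeconds($t))
//           ->save(sprintf('thumb_%02d.jpg', $i));
// }
```

The saved thumbnails can then be stitched into the progress-bar collage.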
More detailed sample code is available here.
One note of caution: due to patent restrictions, not all codecs can be processed by FFmpeg, and some formats are not properly (or fully) supported. I remember struggling a couple of years ago, for example, with the `.3gp` format when support for feature phones was a must.
Queuing
After getting a video’s codecs and other metadata, we push the video to a FIFO (first in first out) conversion queue. The queue was implemented using a simple cron script which selects a given number of unprocessed videos each time it runs and passes them to a conversion utility (sample source code available here).
The conversion utility invokes FFMpeg to perform the conversion and marks each video as having been processed.
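One pass of that cron script can be sketched as follows. The array-based queue and the injected converter callable are illustrative stand-ins for our database table and FFmpeg wrapper:

```php
<?php
// One pass of the cron-driven conversion queue: take up to $batchSize
// unprocessed videos in FIFO order and hand each to the converter.
// $queue is a list of ['id' => ..., 'processed' => bool] records and
// $convert is the conversion utility (both hypothetical stand-ins).
function runConversionPass(array &$queue, callable $convert, int $batchSize): void
{
    $taken = 0;
    foreach ($queue as &$video) {
        if ($taken >= $batchSize) {
            break;
        }
        if (!$video['processed']) {
            $convert($video['id']);     // invoke the FFmpeg-based converter
            $video['processed'] = true; // mark as done so the next pass skips it
            $taken++;
        }
    }
}
```

Bounding the batch size keeps a single cron run from monopolizing the server while still draining the queue in arrival order.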
We also developed a simple wait time estimation mechanism, which calculates the average time to convert 1 minute of video. Using this average, we are able to calculate and display for the user the estimated remaining processing time after a video has finished uploading, based on how many minutes of video remain to be processed.
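That estimate reduces to simple arithmetic, sketched below with an illustrative function name: multiply the observed average conversion time per minute of video by the total minutes still queued ahead of the user's video:

```php
<?php
// Estimate the remaining wait time for a newly uploaded video.
// $avgSecPerMin  -- observed average seconds needed to convert one
//                   minute of video (maintained as a running average).
// $queuedMinutes -- total minutes of video queued ahead of this one.
function estimatedWaitSeconds(float $avgSecPerMin, float $queuedMinutes): float
{
    return $avgSecPerMin * $queuedMinutes;
}
```

So if conversion averages 30 seconds per minute of video and four minutes of footage are queued ahead, the user sees an estimate of about two minutes.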
Video Format Conversion
For still images, certain universally recognized formats (such as JPEG and GIF) have emerged that are supported by essentially all devices and image processing software. While some video formats are more common than others, no such universally supported format has yet emerged for video.
In our case, in addition to needing to convert from a variety of formats to a single common format (MPEG-4), we needed the converted videos to be optimized for streaming to mobile devices.
For video format conversion (at least for our short-term needs), using the cloud-based Amazon Elastic Transcoder was the best option available. In addition to its general ease-of-use, the transcoder service takes care of optimization and all encoding settings. Fortunately, an AWS SDK for PHP is available, which simplifies invoking the service from our PHP code.
Note: Using a cloud-based service like the Amazon Elastic Transcoder is great if you want to get up and running quickly. However, bear in mind that this option can become expensive for your client, especially if their business model is likely to necessitate extensive use of large videos. Another factor to consider is that you should not necessarily assume that your client’s videos or business model will be compatible with the Terms of Service.
Amazon uses its basic storage and compute elements, S3 (Simple Storage Service) and EC2 (Elastic Compute Cloud) – combined with Auto Scaling and SNS (Simple Notification Service) – to provide the ability to scale up and down virtually instantaneously.
Installation of aws-sdk is simple, since Amazon maintains a Composer-installable version of the package. Simply add `"aws/aws-sdk-php": "2.*"` to the require section of your `composer.json` file and run `composer update`.
Obviously, accessing the Amazon Elastic Transcoder requires an Amazon account, so you’ll also need to set that up if you (or your customer) don’t already have one.
Our use of the Amazon Elastic Transcoder service entailed first uploading video files to an appropriate bucket on S3. We then made the transcoder job responsible for decoding and generating a thumbnail which, on completion, posts an HTTP request to the specified address. This does require some configuration in the AWS panel, but it’s quite simple and Amazon provides good documentation on how to do this.
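Submitting a transcoder job from PHP looks roughly like the sketch below. The helper function, pipeline ID, S3 keys, and preset ID are placeholders rather than our production values; conceptually, you point a pipeline at the uploaded S3 object and ask for an MP4 output plus thumbnails:

```php
<?php
// Build the parameter array for an Elastic Transcoder job.
// All IDs and keys below are illustrative placeholders.
function buildTranscoderJob(string $pipelineId, string $inputKey, string $outputKey): array
{
    return [
        'PipelineId' => $pipelineId,
        'Input'      => ['Key' => $inputKey],   // object key in the input S3 bucket
        'Outputs'    => [[
            'Key'              => $outputKey,
            // A system MP4 preset ID; look up the one you need in the AWS console.
            'PresetId'         => '1351620000001-000010',
            'ThumbnailPattern' => 'thumbs/' . $outputKey . '-{count}',
        ]],
    ];
}

// Submitting the job with the AWS SDK for PHP v2 (requires AWS credentials):
// use Aws\ElasticTranscoder\ElasticTranscoderClient;
// $client = ElasticTranscoderClient::factory(['region' => 'us-east-1']);
// $client->createJob(buildTranscoderJob($pipelineId, 'uploads/raw.avi', 'converted/video.mp4'));
```

The pipeline itself (input/output buckets, IAM role, and the SNS topic for completion notices) is configured once in the AWS panel.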
Feel free to make use of our transcoder bundle, which helps simplify integration for Symfony 2. It includes a usage description and offers a controller for quick implementation of a notice service sent by Amazon to collect information about the processed video. A usage example is available here.
In addition, an example controller that handles Amazon notifications is available here, which also implements confirmation of the subscription address: the service first posts a URL that must be visited to confirm that this is a valid notification receiver. After that, all that’s really required is to mark the video as processed. From then on, we can use the transcoded video stored in the cloud.
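Conceptually, the notification endpoint has to distinguish two kinds of SNS POSTs; here is a minimal sketch (the helper name is ours, while `Type`, `SubscribeURL`, and `Message` are SNS's documented JSON fields):

```php
<?php
// Decide how to react to an incoming SNS POST body.
// Returns ['confirm', $url] for the subscription handshake, or
// ['notify', $payload] for a transcoder notification.
function classifySnsMessage(string $body): array
{
    $msg = json_decode($body, true);
    if ($msg['Type'] === 'SubscriptionConfirmation') {
        // Visiting SubscribeURL (e.g., with file_get_contents) confirms
        // that this endpoint is a valid notification receiver.
        return ['confirm', $msg['SubscribeURL']];
    }
    // For transcoder notifications, Message is itself a JSON document
    // describing the job (state, input/output keys, and so on).
    return ['notify', json_decode($msg['Message'], true)];
}
```

The controller can then mark the corresponding video as processed when the notification payload reports a completed job.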
Streaming
Video streaming is a capability that necessitates high performance: User expectations for uninterrupted streaming are high and tolerance for latency is extremely low. This challenge is often exacerbated by the need to stream video to multiple clients concurrently in real-time.
In our case, we needed to support each user being able to make his or her own video channel and begin broadcasting. Our solution consisted of three components:
- Dashboard. Application that serves as the streamer’s dashboard, providing the ability to broadcast video.
- Viewer. Video client that consumes and displays a video stream.
- Streaming Engine. Cloud-based video streaming service.
In addition to the fact that Video on Demand (VOD) technology was still evolving, another issue we faced was that browser camera access was not well supported and offered only a P2P connection, whereas our goal was to provide online broadcasting to multiple concurrent users. Furthermore, support for the `getUserMedia/Stream` API (formerly envisioned as the `<device>` element) was not yet consistent across modern browsers. Based on these factors, I decided to use Flash, since it was really the only reasonable choice at the time. Both applications (Dashboard and Viewer) were therefore implemented using Flex and ActionScript.
For the streaming engine, we used Wowza. Although there are other, non-commercial solutions (such as Red5, which is marketed as essentially a drop-in replacement for Wowza), in our case commercial product support was an important factor. Also, at least at the time we were building the system, Wowza offered better documentation, which was an additional advantage. (Note that you can get a trial version of Wowza for free for 30 days, and there is also a developer’s trial version that you can use for up to 180 days, albeit with some limitations: streaming only works for two clients, and there is a limit on the maximum number of connections.)
Wowza Streaming Engine
We used the LiveStream application provided with Wowza. To configure it, leave `applications/app_name` empty, and into `conf/app_name` copy the `Application.xml` file from the `conf` directory. Edit the file to configure the `<Streams>` section as follows:
<Streams>
<StreamType>live</StreamType>
<StorageDir>${com.wowza.wms.context.VHostConfigHome}/content</StorageDir>
<KeyDir>${com.wowza.wms.context.VHostConfigHome}/keys</KeyDir>
<LiveStreamPacketizers></LiveStreamPacketizers>
<Properties></Properties>
</Streams>
The key parameter is `<StreamType>live</StreamType>`, which specifies that this will be a stream from a live video feed (e.g., a camera). Note that after editing and saving this file, you will need to restart Wowza.
Flash (Flex/ActionScript) Applications
Flash provides a fully integrated system to connect a camera and microphone to a Wowza streaming server. This is particularly helpful if your experience with ActionScript is limited.
The whole application is essentially based on interaction between the following objects:
- NetConnection. The NetConnection class creates a two-way connection between a client and a server. The client can be a Flash Player or AIR application. The server can be a web server, Flash Media Server, an application server running Flash Remoting, or the Adobe Stratus service.
- Camera. The Camera class is used to capture video from the client system or device camera.
- Microphone. The Microphone class is used to monitor or capture audio from a microphone.
- NetStream. The NetStream class opens a one-way streaming channel over a NetConnection.
First, we connect to the Wowza streaming server using the NetConnection instance and then attach the event listener that will listen for changes in network connection status:
nc = new NetConnection();
nc.connect(serverAddress); // serverAddress:String is the RTMP URL of the Wowza application
nc.addEventListener(
NetStatusEvent.NET_STATUS, // event type
eNetStatus, // listener function
false, // use capture?
0, // priority
true // use weak reference?
);
Here’s a minimalistic example of an event listener that connects the camera and microphone to the streaming server:
private function eNetStatus(e:NetStatusEvent):void
{
switch (e.info.code) {
case "NetConnection.Connect.Success":
camera = Camera.getCamera();
mic = Microphone.getMicrophone();
ns = new NetStream(nc);
ns.publish(streamName, "live");
ns.attachCamera(camera);
ns.attachAudio(mic);
break;
case "NetConnection.Connect.Closed":
// debug trace... display user message
break;
}
}
The client code is very similar, except that we just display the video input on the user side. This is done by connecting the stream to a `Video` object, as shown in this simple example:
if(event.info.code == "NetConnection.Connect.Success")
{
ns = new NetStream(nc);
ns.client = nsClient;
ns.addEventListener(NetStatusEvent.NET_STATUS, nsClient.onNetStatus);
ns.play(streamName);
video = new Video();
addChild(video); // this will display video
video.attachNetStream(ns); // connect NetStream to video
}
Wrap up
Live streaming and video can be expected to play an increasingly significant role in mobile and web applications. It is therefore important for web developers to become familiar with video transcoding, processing, and streaming. Numerous tools, libraries, and services exist today for incorporating these capabilities into web applications. This article shows how we leveraged and integrated a number of these technologies to successfully create a basic YouTube-like site with relative ease.
About the author
Krzysztof is a skilled Symfony developer with excellent knowledge of Symfony 2, Symfony 3, PHP, and OOP coding.