Everyone who’s streamed videos on YouTube has, at some point, experienced two incredibly annoying issues: either the video pauses to rebuffer or it suddenly becomes pixelated.
Both problems stem from the special algorithms that split videos into small chunks that download as you watch. If your connection is slow, YouTube may render the next few seconds of video at a lower resolution so that you can keep watching without interruption, which is why the picture becomes pixelated.
If you try to fast-forward to a part of the video that hasn’t loaded yet, the video will most likely stall while that section buffers.
YouTube uses adaptive bitrate (ABR) algorithms to give users a more reliable viewing experience while reducing bandwidth usage. Most people don’t watch videos all the way through, and with billions of hours of video content transmitted daily, it would be wasteful to preload the whole of every lengthy video for every user.
Although ABR algorithms have largely been successful, viewers’ expectations for streaming video keep rising. Those expectations often go unmet when services like YouTube and Netflix, or hugely popular sports live-streaming platforms, have to make imperfect trade-offs between factors such as video quality and how often a video rebuffers.
According to Mohammad Alizadeh, a professor at MIT, “Studies show that users abandon video sessions if the quality is too low, leading to major losses in ad revenue for content providers.” To avoid such losses, streaming sites and platforms must continually seek out fresh approaches to innovation.
A solution in sight
In this vein, Alizadeh and his team at MIT’s Computer Science and Artificial Intelligence Laboratory (CSAIL) have created “Pensieve,” an artificial intelligence system that uses machine learning to select algorithms in response to network conditions.
The team has shown that this approach delivers a higher-quality streaming experience with less rebuffering than conventional systems.
CSAIL’s machine learning approach lets streaming adapt better to changing network conditions. In particular, the team’s testing showed that Pensieve could stream video with 10 to 30% less rebuffering than competing methods, and at levels that users rated 10 to 25% higher on key “quality of experience” metrics.
Pensieve can also be customised to reflect a content provider’s priorities. For instance, if a user on the subway is about to enter a dead zone, YouTube could lower the bitrate so that it can load enough of the video to avoid rebuffering during the network outage. This is especially powerful when used in conjunction with other technologies like GIF compressors.
According to PhD student Hongzi Mao, lead author of a related paper with Alizadeh and PhD student Ravi Netravali, Pensieve is “versatile for whatever you want to optimise it for.”
“You might even envision a user customising their own streaming experience based on how important resolution versus rebuffering is to them.”
How ABR works
Adaptive bitrate algorithms fall into two categories: buffer-based algorithms, which guarantee that a specific amount of future video is constantly buffered, and rate-based algorithms, which gauge how quickly networks transmit data.
Both kinds are limited because neither uses information about rate and buffering together. As a result, these algorithms frequently choose the wrong bitrate, and experts must carefully hand-tune them to suit different network conditions.
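The two categories can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration rather than anything YouTube or the CSAIL team is known to use: the bitrate ladder, the harmonic-mean throughput estimate, and the `reservoir` and `cushion` thresholds are all hypothetical. Note how each function looks at only one signal, which is the limitation described above.

```python
# Hypothetical bitrate ladder in kbps (illustrative values, not from the article).
BITRATES_KBPS = [300, 750, 1200, 1850, 2850, 4300]


def rate_based_choice(recent_throughputs_kbps, bitrates=BITRATES_KBPS):
    """Rate-based: pick the highest bitrate the estimated throughput can sustain.

    The estimate here is the harmonic mean of recent throughput samples,
    which is one common choice; it ignores the buffer entirely.
    """
    if not recent_throughputs_kbps:
        return bitrates[0]  # no measurements yet, so start at the lowest quality
    n = len(recent_throughputs_kbps)
    estimate = n / sum(1.0 / t for t in recent_throughputs_kbps)
    candidates = [b for b in bitrates if b <= estimate]
    return candidates[-1] if candidates else bitrates[0]


def buffer_based_choice(buffer_seconds, bitrates=BITRATES_KBPS,
                        reservoir=5.0, cushion=10.0):
    """Buffer-based: map buffer occupancy onto the ladder; ignores throughput.

    Below `reservoir` seconds the lowest bitrate is used, above
    `reservoir + cushion` the highest, with a linear ramp in between.
    """
    if buffer_seconds <= reservoir:
        return bitrates[0]
    if buffer_seconds >= reservoir + cushion:
        return bitrates[-1]
    fraction = (buffer_seconds - reservoir) / cushion
    return bitrates[int(fraction * (len(bitrates) - 1))]


if __name__ == "__main__":
    print(rate_based_choice([2000, 2000, 2000]))
    print(buffer_based_choice(10.0))
```

The thresholds in `buffer_based_choice` are exactly the kind of values that would need hand-tuning for each network environment.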
Alizadeh says that modeling network dynamics is challenging, and that existing techniques for improving streaming, such as model predictive control (MPC), are essentially only as good as the model they rely on.
Pensieve does not require a model or any pre-existing assumptions about things like internet speed. Instead, it represents an ABR algorithm as a neural network and repeatedly tests it under a wide range of network speed and buffering conditions.
Through a system of rewards and penalties, the system fine-tunes its algorithms. For instance, it might receive a bonus whenever it offers a high-resolution, buffer-free experience, but a penalty when it must rebuffer.
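As a rough illustration of such a reward signal, the sketch below scores each downloaded chunk by its bitrate and subtracts a penalty for any time spent rebuffering. The function names, the units, and the default weight of 4.3 are assumptions chosen for the example; the article does not state the exact formula Pensieve uses.

```python
def chunk_reward(bitrate_mbps, rebuffer_seconds, rebuffer_penalty=4.3):
    """Reward for one video chunk: higher bitrate is good, stalling is bad.

    `rebuffer_penalty` is a hypothetical weight: how many Mbps of quality
    one second of rebuffering is considered to cost.
    """
    return bitrate_mbps - rebuffer_penalty * rebuffer_seconds


def session_reward(chunks, rebuffer_penalty=4.3):
    """Total reward over a session given (bitrate_mbps, rebuffer_seconds) pairs."""
    return sum(chunk_reward(b, r, rebuffer_penalty) for b, r in chunks)


if __name__ == "__main__":
    smooth = [(2.0, 0.0), (2.0, 0.0)]      # high quality, no stalls
    stalled = [(2.0, 0.0), (2.0, 1.0)]     # same quality, one-second stall
    print(session_reward(smooth), session_reward(stalled))
```

A learning system trained against a signal like this is pushed towards bitrate choices that keep quality high without draining the buffer.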
According to Mao, the primary author of the paper, Pensieve “learns how different strategies impact performance, and, by looking at actual past performance, it can improve its decision-making policies in a much more robust way.”
YouTube and other content providers could adjust Pensieve’s reward structure to reflect the metrics they want to prioritise for users. For instance, studies have shown that viewers are more tolerant of rebuffering early in a video than later on, so the algorithm could be tuned to apply a larger rebuffering penalty as the video goes on.
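One way a provider might express that preference is to make the rebuffering penalty grow with playback time. The sketch below is purely illustrative: the linear growth, the parameter names, and the default values are assumptions, not details from the Pensieve work.

```python
def time_weighted_penalty(playback_seconds, base_penalty=4.0, growth_per_minute=0.5):
    """Rebuffering penalty that grows linearly the further into the video we are.

    Both `base_penalty` and `growth_per_minute` are hypothetical tuning knobs.
    """
    return base_penalty * (1.0 + growth_per_minute * playback_seconds / 60.0)


def chunk_reward_over_time(bitrate_mbps, rebuffer_seconds, playback_seconds):
    """Chunk reward in which a stall late in the video costs more than an early one."""
    penalty = time_weighted_penalty(playback_seconds)
    return bitrate_mbps - penalty * rebuffer_seconds


if __name__ == "__main__":
    # The same one-second stall, at the start and two minutes in.
    print(chunk_reward_over_time(2.0, 1.0, playback_seconds=0))
    print(chunk_reward_over_time(2.0, 1.0, playback_seconds=120))
```

With these example values, a stall two minutes into the video is penalised twice as heavily as the same stall at the start.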
Combining deep learning and machine learning methods
The team tested Pensieve in a variety of environments, including Wi-Fi at a cafe and an LTE network while walking down the street. The experiments showed that Pensieve could match MPC’s video resolution while rebuffering 10% to 30% less.
According to Mao, when Pensieve was trained on synthetic data in a “boot camp” setting, it learned ABR algorithms that were robust enough for real networks. This kind of stress test demonstrates its ability to generalise to new situations encountered in the real world.
He said the researchers’ tests show that Pensieve can perform well even in circumstances it has never encountered before.
According to Vyas Sekar, a Carnegie Mellon University assistant professor of electrical and computer engineering who was not involved in the work, “prior approaches tried to use control logic that is based on the intuition of human experts.”
Alizadeh further points out that Pensieve was trained on only a month’s worth of downloaded video. He says that if the research team had access to a data set on the scale of YouTube’s or Netflix’s, he would expect even greater performance gains. His team’s next project will be to evaluate Pensieve on virtual reality (VR) video.
“We’re excited to see what systems like Pensieve can do for things like VR,” Alizadeh says. “This is really just the first step in seeing what we can do.”