I’m a software guy – I view most problems through the lens of software flexibility, algorithms, and user experience design. It’s probably why I was so shocked when I first realized that trillions of pixels were being deployed every year without a thought about how they are to be accessed and shared (past the video cable). Since that moment, we’ve developed and sold software that Intel rightly called ‘software defined displays’ several years ago.

We decided that the Audio/Visual resellers and integrators were the right partners for us. Contrary to popular belief in some software circles, the AV community is capable of selling enterprise software systems; it just hadn’t been given much of a chance. In the past 18 months, we’ve been proven right. Our AV partners are able to sell, configure, and support Solstice, our software that transforms your displays from lonely AV islands into shared, wirelessly accessible infrastructure.

That being said, I have been in meetings where I realized that even a simple technology primer for traditional AV technicians could be really valuable. As the IT/software world continues to eat custom hardware solutions, every member of the AV world should know at least these three things (if nothing else, they’ll make you sound hip at the next InfoComm after-hours party):

  1. The Difference between TCP and UDP: Both are transport protocols in the Internet Protocol (IP) suite, which defines how devices communicate over the internet. Both send “packets” between senders and receivers in an agreed-upon format. TCP focuses on in-order, lossless delivery of information. It accomplishes this by adding a sequence number to each packet so that the receiver can reassemble the packets in the order in which they were sent. In addition, TCP requires the receiver to acknowledge the data it receives. If the sender doesn’t get an acknowledgment, it assumes the packet was lost and re-transmits it. This is great for reliable delivery when time isn’t of the essence. Think email over TCP. In contrast, UDP focuses on fast delivery of packets that can arrive out of order or be lost. It’s typically used for real-time video, where a few lost packets may not even be noticed but real-time speed is required. It’s also a good option with one sender and many receivers, because the receivers don’t crush the sender with acknowledgements.
  2. What is a GPU and why is it important? If you’re a gamer, you may look carefully at the GPU specs on your PC, but AV technicians should also understand what the Graphics Processing Unit (GPU) is, how it differs from a CPU, and why it matters. The GPU was born out of a standardized approach to rendering graphics to a display. It turns out that displaying a beautiful 1080p resolution image – with realistic lighting effects, shadows, and depth – often requires the same operation to be applied to a huge number of pixels quickly. Applying the same operation to different data (pixels, for example) is best done with a type of parallel computing architecture called SIMD (single instruction, multiple data). Imagine, for example, adding a brightness mask stored as an array of pixels to an existing image. If your images are 1080p, that’s about 2 million pixels added to 2 million pixels. A modern GPU spreads this work across its many cores and finishes it in a small fraction of the time a sequential loop on a CPU would take. This doesn’t mean that a GPU is limited to processing images. Any problem that can be cast as a SIMD problem can be accelerated on a GPU – from analyzing the stock market to solving complex problems in pharmaceutical design. GPUs are architected with thousands of ‘cores’ to pull this off, while a CPU is oriented towards very fast, sequential, and general operations. Given the focus of AV, the GPU is finding increasing use in some very AV-centered products, including video distribution and switching, as well as our own wireless content streaming product, Solstice.
  3. How does video compression work? While video compression can be complex and deeply mathematical in practice, it’s valuable to understand how it works in principle. Video compression is based on the fact that a digital video signal carries a large amount of redundant information. By removing redundancies (or approximating them), a new, much smaller signal is created. This matters greatly. For example, a Crestron DM system is capable of routing 4K video on the network only because it routes compressed video. If that video were sent uncompressed at 30Hz and you assume 10 bits per pixel, you’d be transmitting about 83 Mbits per frame, or a whopping 2.5 gigabits per second. This simply won’t happen on a typical real-world network.
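The uncompressed bandwidth of a video stream is easy to check with a few lines of Python. This is only a rough sketch: the function name `uncompressed_bitrate` is made up for illustration, and it assumes “4K” means 3840 x 2160 pixels at 10 bits per pixel and 30 frames per second.

```python
def uncompressed_bitrate(width, height, bits_per_pixel, frames_per_second):
    """Return (bits per frame, bits per second) for raw, uncompressed video."""
    bits_per_frame = width * height * bits_per_pixel
    return bits_per_frame, bits_per_frame * frames_per_second


# Assumed values: 4K UHD resolution, 10 bits per pixel, 30 Hz.
frame_bits, stream_bits = uncompressed_bitrate(3840, 2160, 10, 30)

print(f"{frame_bits / 1e6:.1f} Mbit per frame")      # 82.9 Mbit per frame
print(f"{stream_bits / 1e9:.2f} Gbit per second")    # 2.49 Gbit per second
print(f"{stream_bits / 8 / 1e6:.0f} MB per second")  # 311 MB per second
```

Changing the assumed bits per pixel or frame rate scales the result linearly, which is a handy way to sanity-check vendor bandwidth claims.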

So, how do we get around sending so much information? You should generally think about redundancy in three different categories: spatial, temporal, and informational. Compression algorithms look for and remove redundancy in each area. In the spatial domain, for example, consider an image that consists of a solid black background for the top 10 scanlines. That’s a lot of redundant information. An uncompressed representation might include the pixel color [0,0,0] for each of these 1920 x 10 values. This could be re-encoded quickly if we change the representation to ‘0,0,0 19,200 times’. Some form of this technique, known as ‘Run Length Encoding’, can be found in all modern encoders (for example H.264).
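Run-length encoding is simple enough to sketch in Python. The code below is a toy illustration, not how any particular codec implements it; the function names `rle_encode` and `rle_decode` and the pixel layout (a flat list of RGB tuples) are assumptions made for the example.

```python
def rle_encode(pixels):
    """Collapse consecutive identical pixels into [pixel, count] runs."""
    runs = []
    for pixel in pixels:
        if runs and runs[-1][0] == pixel:
            runs[-1][1] += 1          # same color as the previous pixel: extend the run
        else:
            runs.append([pixel, 1])   # new color: start a new run
    return runs


def rle_decode(runs):
    """Expand [pixel, count] runs back into the original pixel list."""
    pixels = []
    for pixel, count in runs:
        pixels.extend([pixel] * count)
    return pixels


# Ten black scanlines of a 1920-pixel-wide image, followed by one white pixel.
black, white = (0, 0, 0), (255, 255, 255)
scanlines = [black] * (1920 * 10) + [white]

encoded = rle_encode(scanlines)
print(encoded)  # [[(0, 0, 0), 19200], [(255, 255, 255), 1]]
assert rle_decode(encoded) == scanlines  # nothing was lost
```

The 19,201 pixels shrink to two runs, and decoding restores them exactly, which is why this kind of spatial compression is called lossless.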

Temporal redundancies become obvious when you think about what video content typically contains – a scene that is only slightly changing over time. Rarely does a scene change completely from frame to frame. Consider a camera that is panning across a scene – much of the information contained at one location or block of video is simply displaced to a new location in the image. There are all kinds of reasons this is only an approximation, for example projective effects, but for this discussion we’ll assume it’s true. Given this realization, video encoders can simply encode ‘Block X moved to pixel i,j’. This type of representation allows us to reconstruct the image completely but needs far less data to encode.
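The ‘block moved’ idea can be sketched as a small brute-force search in Python. This is a simplified illustration under stated assumptions: frames are tiny grayscale grids (lists of rows), the helper names `get_block`, `sad`, and `find_motion` are invented for the example, and real encoders use much faster search strategies than trying every offset.

```python
def get_block(frame, top, left, size):
    """Copy a size x size block out of a frame (a list of rows)."""
    return [row[left:left + size] for row in frame[top:top + size]]


def sad(block_a, block_b):
    """Sum of absolute differences: 0 means the blocks are identical."""
    return sum(abs(a - b)
               for row_a, row_b in zip(block_a, block_b)
               for a, b in zip(row_a, row_b))


def find_motion(prev_frame, cur_frame, top, left, size, search=2):
    """Find where a block of the current frame came from in the previous frame.

    Returns (d_row, d_col), the offset into prev_frame with the lowest SAD.
    """
    target = get_block(cur_frame, top, left, size)
    height, width = len(prev_frame), len(prev_frame[0])
    best = None
    for d_row in range(-search, search + 1):
        for d_col in range(-search, search + 1):
            r, c = top + d_row, left + d_col
            # Only consider candidate blocks that lie fully inside the frame.
            if 0 <= r <= height - size and 0 <= c <= width - size:
                cost = sad(target, get_block(prev_frame, r, c, size))
                if best is None or cost < best[0]:
                    best = (cost, d_row, d_col)
    return best[1], best[2]


# A 2x2 patch sits at column 1 in the previous frame and column 3 in the current one.
prev_frame = [[0, 0, 0, 0, 0, 0],
              [0, 9, 8, 0, 0, 0],
              [0, 7, 6, 0, 0, 0],
              [0, 0, 0, 0, 0, 0]]
cur_frame = [[0, 0, 0, 0, 0, 0],
             [0, 0, 0, 9, 8, 0],
             [0, 0, 0, 7, 6, 0],
             [0, 0, 0, 0, 0, 0]]

print(find_motion(prev_frame, cur_frame, top=1, left=3, size=2))  # (0, -2)
```

Instead of sending the four pixel values again, the encoder only needs to send the offset (0, -2): “this block is the one from two pixels to the left in the previous frame.”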

Finally, informational compression relies on the fact that certain types of information do not need to be preserved if the viewer is the human visual system. You’re far better at recognizing brightness differences than at distinguishing color hues. Quantization of the color space is very common in compression schemes.
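Color quantization can be sketched in a few lines of Python. The example assumes each pixel is already split into one brightness value and two color values (a luma/chroma layout), each 8 bits; the function names and the choice of keeping 4 bits of color are illustrative assumptions, not the settings of any particular codec.

```python
def quantize(value, bits_kept):
    """Reduce an 8-bit value (0-255) to bits_kept bits of precision.

    The low-order bits are zeroed, so nearby values collapse to the same level.
    """
    shift = 8 - bits_kept
    return (value >> shift) << shift


def quantize_pixel(luma, chroma_blue, chroma_red, chroma_bits=4):
    """Keep brightness (luma) intact but coarsen the two color channels."""
    return (luma,
            quantize(chroma_blue, chroma_bits),
            quantize(chroma_red, chroma_bits))


# Two pixels with the same brightness and slightly different hues.
print(quantize_pixel(200, 131, 77))  # (200, 128, 64)
print(quantize_pixel(200, 140, 70))  # (200, 128, 64)
```

Both pixels end up with identical color values, so they can be stored with fewer bits (and compress better with the techniques above), while the brightness your eye is most sensitive to is untouched. Unlike run-length encoding, this step is lossy: the original hues cannot be recovered.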

Why does compression matter to the world of AV?  Without it, the revolution we’re now experiencing – that allows video to be transmitted along with other data on the internet – wouldn’t be happening. Our software, Solstice, has been shown to result in well over 500:1 compression of the video signal with very little compromise in image quality. This is partly why analog video, and even video standards, are quickly becoming a thing of the past.

About Christopher Jaynes

Jaynes received his doctoral degree at the University of Massachusetts, Amherst where he worked on camera calibration and aerial image interpretation technologies now in use by the federal government. Jaynes received his BS degree with honors from the School of Computer Science at the University of Utah. In 2004, he founded Mersive and today serves as the company's Chief Technology Officer. Prior to Mersive, Jaynes founded the Metaverse Lab at the University of Kentucky, recognized as one of the leading laboratories for computer vision and interactive media and dedicated to research related to video surveillance, human-computer interaction, and display technologies.
