System Science - Part 1: Connection Methods
The audio interface might be the most important piece of kit in your entire studio. But with so many on the market, how do you know which one is right for you?
USB, PCIe, Thunderbolt or Ethernet?
This series of Web articles explains some of the key aspects of interface design, giving you the information you need to make an informed choice. First of all, we look at the different ways interfaces can connect to computers. What’s the difference between USB 3, USB-C, Thunderbolt and other standards, and which one is best for you? Here’s our guide through the maze of protocols and connectors.
Out in the Open
Computers are designed to be expandable. Manufacturers can’t predict how large a screen you might want, or what sort of keyboard you like, or whether you’ll want to add a graphics tablet, a scanner, or a high-performance audio interface. So, where possible, they leave these choices up to the user. If all of these different peripherals had their own different ways of connecting, no computer could work with all of them. The solution is to come up with universal connection systems. Rather than develop a socket specially for printers, we create a standard socket that anything can connect to, as long as it speaks the right language. We can use it to attach a printer, but we can also connect a graphics tablet, a soundcard, a games controller, an external hard drive or a novelty photo frame.
Only you know what devices you might want to attach to your computer. Standard protocols mean computer manufacturers don’t have to choose for you. In an ideal world, there would perhaps be just one universal system for connecting all hardware. We don’t live in an ideal world, so computers sport a wide range of different connectors. In more or less common use today, you’ll find USB 2.0, 3.0 and 3.1; Thunderbolt 1, 2 and 3; Gigabit Ethernet; PCIe… So what’s the difference, and which ones are best for audio recording?
Bits And Bandwidth
At the most basic level, all these connections make an electrical link between the computer and the other device. This link is then used to transfer digital data: high and low voltages that represent binary digits or ‘bits’. A microchip called a controller at either end of the link takes care of sending and receiving these ones and zeroes, but as far as the controller is concerned, all that matters is that a string of ones and zeroes has successfully been transmitted. It doesn’t know or care whether those bits add up to a photo of your kids, a chapter of your latest novel, or the best vocal take of your life.
The number of bits that can be sent over a connection every second is known as its bandwidth, and historically, not all types of connection had a bandwidth that was adequate for audio recording. The original USB 1.1 specification, for instance, offered just enough bandwidth to send stereo 24-bit, 44.1kHz audio in both directions at once. That was all right if you just wanted to record the odd vocal or guitar part, but not much use if you were tracking an entire band live on the studio floor. Thankfully, those days are behind us, and all of the connectors you’re likely to find on a modern computer offer enough bandwidth for big multitrack recording projects. However, this bandwidth can still be eroded if that connection is shared by multiple devices.
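To put rough numbers on this, the raw data rate of an audio stream is simply channels × bit depth × sample rate. The short Python sketch below works that out and compares it with the headline signalling rates of a few connections. The rates in `NOMINAL_RATE` are nominal figures added here for illustration, and the 32-channel, 96kHz session is a hypothetical example; protocol overhead means the usable throughput of any real connection is lower than its headline figure.

```python
def audio_bitrate(channels, bit_depth, sample_rate_hz):
    """Raw audio payload in bits per second, for one direction only."""
    return channels * bit_depth * sample_rate_hz


# Nominal signalling rates in bits per second. These are headline figures
# assumed for illustration, not measured usable throughput.
NOMINAL_RATE = {
    "USB 1.1 (Full Speed)": 12_000_000,
    "USB 2.0 (High Speed)": 480_000_000,
    "Gigabit Ethernet": 1_000_000_000,
    "Thunderbolt 3": 40_000_000_000,
}

# Stereo 24-bit, 44.1kHz audio, in and out at the same time.
stereo_both_ways = 2 * audio_bitrate(2, 24, 44_100)

# A hypothetical live band session: 32 channels of 24-bit, 96kHz audio,
# again counted in both directions.
band_both_ways = 2 * audio_bitrate(32, 24, 96_000)

for name, rate in NOMINAL_RATE.items():
    print(f"{name}: stereo needs {stereo_both_ways / rate:.2%} of the "
          f"nominal rate, the band session {band_both_ways / rate:.2%}")
```

The band session exceeds the nominal rate of USB 1.1 many times over, yet takes up only a modest share of any of the later connections.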
In Real Time
With some types of computer peripheral, it’s more important to transfer data reliably than to transfer it continuously. When we’re backing up to a hard drive, it’s not disastrous if things occasionally get held up: what really matters is that the data gets there intact in the end. For sound recording, though, the mere ability to transfer large amounts of data isn’t enough. We need to maintain a continuous stream of data, and if that stream is interrupted, we can’t afford to wait around while the missing bits are tracked down. Most connection types offer different transfer modes to reflect these different priorities. Data sent to a USB hard drive, for instance, would be transmitted as a ‘bulk transfer’. This offers no guarantees as to how long the process will take, but incorporates robust error correction that ensures any lost or corrupted data gets re-sent. A USB audio interface, by contrast, would transmit its data as an ‘isochronous transfer’. In this mode, the data is transferred in real time, but if any of it gets lost or damaged, the transfer simply ploughs on regardless.
The difference between a bulk transfer and a real-time streaming (isochronous) transfer. Bulk transfers are sporadic, and would wreak havoc on your pristine audio streams.
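The trade-off between the two modes can be illustrated with a toy model in Python. This is a deliberately simplified sketch rather than a description of how a USB controller really works: the function names are invented for the example, each packet is assumed to occupy one time slot, and a lost packet is assumed to get through on its first re-send.

```python
def bulk_transfer(packets, lost):
    """Re-send any lost packet: the data arrives intact, but timing varies."""
    received, slots_used = [], 0
    for index, packet in enumerate(packets):
        # Assumption: one extra time slot for each packet that is re-sent.
        slots_used += 2 if index in lost else 1
        received.append(packet)
    return received, slots_used


def isochronous_transfer(packets, lost):
    """Never re-send: timing is fixed, but lost packets leave gaps (None)."""
    received = [None if index in lost else packet
                for index, packet in enumerate(packets)]
    return received, len(packets)


packets = ["p0", "p1", "p2", "p3", "p4"]
lost = {1, 3}  # hypothetical: packets 1 and 3 are corrupted in transit

bulk_data, bulk_slots = bulk_transfer(packets, lost)
iso_data, iso_slots = isochronous_transfer(packets, lost)

print("bulk:       ", bulk_data, "in", bulk_slots, "slots")
print("isochronous:", iso_data, "in", iso_slots, "slots")
```

The bulk transfer delivers every packet but takes an unpredictable number of slots, while the isochronous transfer always finishes on schedule and simply leaves holes where data went missing.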
Even in an isochronous transfer, however, the devices at either end of the link have to package up the data, transmit it and then unpack it again. This takes time, and in a typical audio recording scenario, it happens twice: once for data on its way into the computer, and again on the way out. If this process takes more than a few milliseconds, it can play havoc with a recording session. Performers relying on hearing themselves through headphones can be completely thrown if their voice or instrument comes back to them audibly delayed. This delay is called latency, and keeping it to an acceptably low level is one of the biggest challenges in designing an audio interface. Some connection standards make this challenge easier to meet than others, but good low-latency performance is ultimately down to the design of the interface and its associated software drivers. A well-designed interface that uses one standard will out-perform a poorly designed interface that uses another, whatever the theoretical advantages of the latter.
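As a rough guide to the numbers involved, the sketch below estimates latency from the size of the chunks in which audio is packaged. It rests on assumptions that go beyond the text above: that audio is handled in fixed-size buffers measured in samples, that the same buffer size applies on the way in and on the way out, and that everything else (converters, drivers, the connection itself) can be lumped into a single hypothetical `overhead_ms` figure.

```python
def buffer_latency_ms(buffer_samples, sample_rate_hz):
    """Time taken to fill one buffer of audio, in milliseconds."""
    return 1000.0 * buffer_samples / sample_rate_hz


def round_trip_ms(buffer_samples, sample_rate_hz, overhead_ms=0.0):
    """Estimated round trip: one buffer in, one buffer out, plus overhead.

    overhead_ms is a hypothetical catch-all for converter, driver and
    transport delays, which vary from one interface to another.
    """
    return 2 * buffer_latency_ms(buffer_samples, sample_rate_hz) + overhead_ms


for size in (32, 64, 128, 256, 512):
    print(f"{size:>4} samples at 44.1kHz: "
          f"{round_trip_ms(size, 44_100):.1f} ms round trip")
```

Real-world figures will be higher than these idealised ones, and how much higher depends on the interface and its drivers.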
In the case of PCIe, the electrical connection between computer and expansion devices is achieved by plugging the expansion card directly into a slot on the computer’s motherboard. Other types of connection use cables to join the computer and the expansion device, and in general, they all require their own individual connectors and cables. However, it’s not as simple as each standard having a single connector type. For example, the original FireWire 400 specification supported two different types of connector: a larger six-pin one, and a smaller four-pin one for laptops and mobile devices. When the specification was upgraded as FireWire 800, a third connector type appeared. USB 2 and 3.0 are even more complicated. Most of the other standards that are current today are ‘peer to peer’, meaning that they treat all the connected devices as equals. USB uses a different model where the computer is a ‘master’ and all other devices are ‘slaves’, and to emphasise that distinction, USB 2 and 3.0 cables have different plugs at either end. Not only that, but at the ‘slave’ end, there are numerous different ‘Type B’ connectors available to suit different devices.
Cable types (top to bottom): Thunderbolt 1 & 2 using the MiniDisplayPort connector. USB 1 & 2 (orange). USB 3 (note the blue plug interior). USB-C / Thunderbolt 3. Ethernet/RJ45 — usually for networked audio applications.
Type C For Confusion
The potential for confusion grows ever greater when we consider Thunderbolt and USB 3.1. Instead of having a single standard with several different connector types, we now have single connectors that can support several different standards.
When Apple and Intel first cooked up Thunderbolt, the original idea was that it would use optical rather than electrical cables. By the time it was actually implemented on real computers, that plan had changed — but instead of introducing a new connector, the first and second generations of Thunderbolt used the existing ‘mini DisplayPort’ plug and socket. This made sense, because Thunderbolt was in some ways not really a new standard at all, but a container for two existing standards, being intended as a way of creating an external PCI Express connection while incorporating DisplayPort for connecting screens. In the meantime, the consortium behind USB was working on a new standard called USB 3.1, which differs from its predecessors in several ways. One of the most obvious of these is that whereas a USB 2 or 3.0 cable had to have a Type A plug at the computer end and a Type B plug at the other end, USB 3.1 cables are all reversible, and have the same plug at either end. This is yet another new connector and is known, predictably enough, as a Type C connector. (There are currently few, if any, audio interfaces with Type C sockets.)
Intel’s response was to extend the remit of Thunderbolt even further, so that the new Thunderbolt 3 standard actually incorporates USB 3.1 and HDMI alongside DisplayPort and PCI Express. As part of this move, Intel switched from the mini DisplayPort connector to the USB Type C connector. This did away with one of the major disadvantages of Thunderbolt 1 and 2, which is that the cables themselves have to incorporate dedicated microchips to boost and detect the signal, making them much more expensive than typical USB, FireWire or Ethernet cables.
USB 3.1 and Thunderbolt 3 connections thus use the same cable and plug types, but although you can connect a USB 3.1 device to a Thunderbolt 3 socket on your computer and expect it to work, the reverse is not true. So it’s vital to realise that merely because your computer has a Type C socket does not mean it will necessarily be able to connect a Thunderbolt audio interface. Thunderbolt-compatible sockets should bear the Thunderbolt logo; if yours doesn’t, assume it’s ‘just’ a USB 3.1 socket. At the time of writing, all the Thunderbolt interfaces on the market still use the original mini DisplayPort connectors, so will need an adaptor if they are to be connected to a Thunderbolt 3 Type C port.
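The one-way nature of this compatibility can be summed up in a few lines of Python. The names below are invented for the illustration, and the table encodes only the rule stated above: it says nothing about adaptors, cable quality or individual products.

```python
from enum import Enum


class Protocol(Enum):
    USB_3_1 = "USB 3.1"
    THUNDERBOLT_3 = "Thunderbolt 3"


# Which devices each kind of Type C socket can host, as described in the text:
# a Thunderbolt 3 socket also accepts USB 3.1 devices, but not vice versa.
SOCKET_SUPPORTS = {
    Protocol.USB_3_1: {Protocol.USB_3_1},
    Protocol.THUNDERBOLT_3: {Protocol.USB_3_1, Protocol.THUNDERBOLT_3},
}


def will_connect(device, socket):
    """Return True if a device of this type should work in this socket."""
    return device in SOCKET_SUPPORTS[socket]


for device in Protocol:
    for socket in Protocol:
        verdict = "works" if will_connect(device, socket) else "does not work"
        print(f"{device.value} device in {socket.value} socket: {verdict}")
```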
As we mentioned earlier, PCIe devices actually live inside the computer, on cards inserted into slots in its motherboard. Compared with some external connection protocols such as USB or FireWire, PCIe cards integrate on a more basic level into the architecture of the computer. In fact, most other connection types can ‘piggyback’ on PCIe. If, for instance, we wanted to increase the number of USB ports on our computer, we could install a PCIe card to add those connections.
Front/Centre: open PCIe slots on a PC motherboard. PCIe cards fit inside the computer and integrate into its architecture at a relatively low level.
The privileged status of the PCI Express card means that from a purely technical point of view, it can support excellent performance for audio interfaces, offering huge channel counts and the potential for very low latency. Yet dedicated PCIe audio interfaces are relatively thin on the ground, and tend to be confined to the higher end of the market. One reason for this is simply that many computers don’t have the slots to put them in. Desktop and tower PCs with PCIe slots make up less than half of total computer sales, and PCIe is not available on any laptop or tablet computer, nor on any current Apple machine.
Another reason is that most of us don’t actually want our audio interface to be inside our computer. Musicians need easily accessible sockets for connecting microphones, loudspeakers, headphones and so on. The backplane of a PCIe card isn’t easy to get at, and doesn’t have enough space for them all. Most PCIe audio interfaces thus don’t place audio connectors directly on the card; instead, they come with external boxes that handle things like analogue-to-digital conversion, metering, and connecting audio gear. This is the basis of several high-end professional solutions, including the popular Pro Tools HDX system from Avid, but it’s relatively complex and expensive to implement.
PCIe is used mostly by high-end modular systems such as Avid’s HDX, which combines PCIe cards with external rackmountable converters, such as the Focusrite Red 8Pre, as shown here.
A third reason is that many of the advantages of PCIe can now be had in another way. As we’ve already seen, the idea behind the Thunderbolt protocol is to ‘externalise’ PCIe, allowing devices outside the computer to hook into its internal architecture in the same privileged way. It is not yet clear whether Thunderbolt audio interfaces can quite match the low-latency performance of the very best PCIe cards, but any remaining advantage to the latter would seem to be outweighed by the greater convenience of the Thunderbolt format.
Despite having been designed precisely with applications such as audio recording in mind, the Universal Serial Bus protocol has had a sketchy history in this department. From USB 2.0 onwards, the standard has offered enough theoretical bandwidth for serious multitrack recording, but the real-world low-latency performance of USB interfaces has been variable. The best of them, including Focusrite’s second-generation Scarlett range, come pretty close to matching what’s possible with Thunderbolt or PCIe. Others struggle to deliver latency figures low enough for real-time monitoring, or do so only with very high CPU loads. This is a crucial point to research when you’re choosing a USB interface.
Although there are now some audio interfaces that are marketed as ‘USB 3’ devices, this is sometimes just a branding exercise designed to stop them appearing behind the times. Most of these are 100 percent compatible with USB 2 and don’t necessarily use either the additional bandwidth or the new ‘Super Speed’ transmission type specified in USB 3. It remains to be seen whether genuine USB 3 interfaces will improve on the low-latency performance of USB 2 devices, but the early signs are that they won’t.
Even USB 2, however, has plenty to offer when manufacturers get their driver software right. For one thing, it is genuinely universal: you’d have to look pretty hard to find a desktop, tower or laptop computer that doesn’t have USB ports. It’s also backwards-compatible — USB 2 devices should work as advertised on USB 3 ports, and on USB-C ports with an adaptor — and apparently future-proof, especially now that Thunderbolt has become a container for USB 3.1. If you buy a USB interface now, it will be a long time before changing fashions in computer design leave you unable to connect it to anything. What’s more, many USB interfaces are now ‘class compliant’. This means they will work with the generic USB audio drivers built into Mac OS and iOS, so you can use these interfaces not only on Macs but also on iPads.
Perhaps the most compelling argument for USB is cost. Because it’s such a universal standard, USB controller chips, sockets and other components are very cheap, so an audio interface that connects over USB will typically cost less than a Thunderbolt interface with otherwise identical features.
At the same time, however, USB does have some inherent disadvantages. One is that, compared with other connection protocols, it is difficult to implement ‘cascading’ or ‘daisy-chaining’ of peripherals. This means that you’ll need either a USB hub or additional USB ports on your computer if you want to run many USB devices at once. It also means that, with a few exceptions, USB audio interfaces generally can’t be run in multiples to add extra inputs and outputs.
Although multitrack audio recording on a computer can be challenging, the amount of data involved is relatively modest compared with, say, the capture of high-definition video footage. It was intensive tasks of this kind that prompted the development of an externalised version of PCIe, and the latest version of this new standard, Thunderbolt 3, can shift a mind-boggling 40 Gigabits per second. In principle, Thunderbolt combines the advantages of PCIe (very high data bandwidth and privileged low-level access to the internal architecture of the computer) with the benefits of a well thought-out cabled protocol. Like USB interfaces, Thunderbolt devices can be ‘hot plugged’, meaning that you don’t need to shut the computer down before attaching or disconnecting them. Like USB, the Thunderbolt cable carries power as well as data. And unlike USB or PCIe, Thunderbolt permits multiple devices to be connected in a ‘daisy chain’. In short, Thunderbolt has the capacity to allow us to record as many tracks as we’re ever likely to need, with latency figures that are potentially as low as are achievable with PCIe cards. And, in stark contrast to early USB or FireWire interfaces, Thunderbolt interfaces have delivered on this potential right from the start.
So why aren’t we all using them? One reason is cost. USB is not only a relatively mature technology, but also one with a huge mass-market take-up. This means that USB components are available very cheaply, and that there are off-the-shelf drivers and other software that can be licensed. Making the switch to a new technology such as Thunderbolt commits manufacturers to significant investment in development, and to paying a premium for any hardware or software they license. With Thunderbolt 1 and 2, there was also that pesky active cable to consider; understandably, consumers baulked at paying £50 or so for what looked like a short piece of plastic-coated wire.
Another reason is availability. Apple’s enthusiastic commitment to Thunderbolt has not been mirrored by PC motherboard manufacturers; and whereas it’s possible to retrofit USB or Ethernet ports to a computer by installing the appropriate PCIe card, this is not true of Thunderbolt, which can only be added if the motherboard has built-in support for it. For quite a while, it looked as though Thunderbolt would remain limited to Macs and a few specialist Windows machines. However, although it is too soon to be certain, the signs are that Thunderbolt 3 is at last making serious inroads into the PC market.
Computers don’t only need to talk to peripherals such as printers and audio interfaces. They also need to talk to other computers, and whereas other standards such as USB and Thunderbolt are primarily designed for adding additional hardware to a single computer, Ethernet has been developed mainly for communication between computers. So, although it offers plenty of bandwidth for audio recording, it wasn’t originally designed for that purpose, and there are many technical challenges that need to be overcome in making ‘audio over IP’ work properly. As a result, although Ethernet is the oldest connection protocol in common use today, and one of the most ubiquitous, it’s only in recent years that it has become a viable option for studio recording.
The ubiquitous RJ45 connector has been a fixture on computers for many years, but it’s only recently that Ethernet has become a popular option for studio recording. Here, a rack of Focusrite RedNet MP8R mic preamps is connected to the Dante network via conventional RJ45 Ethernet cables.
The great strength of Ethernet is also, as far as recording is concerned, its Achilles heel. An Ethernet network can allow tens or even hundreds of computers to communicate, but it can’t anticipate when each one of them will need to send data. The upshot is that it’s hard to guarantee an arrival time for any given packet of data, because the network might turn out to be busy when we want to send it. In other words, it’s difficult to establish the kind of real-time data transfer we need for audio streaming.
Audio equipment manufacturers have come up with numerous ways around this problem. Many of them use the Ethernet medium in non-standard ways, and require a dedicated network that is not shared with non-audio traffic. By contrast, Dante and AVB exploit developments within the Ethernet standard itself to send audio as normal network traffic. It is likely that the next few years will see one of these two become the industry standard for Ethernet audio, though it’s not yet clear which one will win out: Dante has a clear commercial lead, but AVB has the potential plus of being an open and non-proprietary system.
One of the big advantages Ethernet has over USB or Thunderbolt is that it can connect over much longer distances. USB Type C cables are restricted to 2 metres for USB 3.1 operation, and Thunderbolt 3 can only achieve its full potential over cables of less than half a metre in length. Ethernet cables, by contrast, can be up to 100 metres long. Ethernet can also supply power: Focusrite’s RedNet AM2 Dante headphone amp is an example of an audio device that can be powered in this way.
The biggest unique advantage of Ethernet audio, however, is its scalability. In practical terms, the other standards described above are only useful for simple setups where a single audio interface is exchanging data with a single computer. In an Ethernet network, however, there is almost no limit on how many devices can be connected. Multiple computers on a network can access the data from a single audio interface, or a single computer can simultaneously record audio from multiple interfaces in different live rooms. If we need more headphone outputs in the middle of a session, we can simply plug additional Dante or AVB-enabled headphone amps into the network.
Block diagram of complex networked studio installation with multiple live and control rooms connected over Ethernet.
Hybrid Systems
Although the network sockets built into typical computers are usable for audio over Ethernet, it’s usually necessary to install a purpose-designed Ethernet card for best results. A Dante system using Focusrite’s RedNet PCIe card, for example, can achieve very good low-latency performance. However, fast Ethernet connectivity can also be added in other ways, and this fact has been exploited to create interfaces that combine both USB or Thunderbolt audio and audio over Ethernet. In effect, a single unit acts both as a Thunderbolt or USB audio interface, and as an Ethernet switch or hub for connecting Dante or AVB audio devices. This concept is the basis of Focusrite’s Red 4Pre, Red 8Pre and Red 16 Line, and is one of the key developments that is bringing Ethernet audio to a wider market.
Focusrite’s Red interface range offers the joint benefits of fast, easy-to-use Thunderbolt interfacing and expandable Dante networking.
Most of the many audio-over-Ethernet protocols in existence were designed with the needs of major broadcasters, live-sound venues and installation clients in mind, because the key benefits of scalability and long cable runs are most relevant in those scenarios. As a result, Ethernet audio setups have tended to be expensive, and they have a reputation for being complex to set up. However, there is now an increasingly wide range of Dante and AVB products suitable for studio use, and hybrid systems such as the Red 4Pre offer an affordable yet powerful point of entry into this world.
One of the main reasons why Intel eventually chose to use electrical rather than optical cables for Thunderbolt was that they can deliver power as well as data. This is an important factor for many add-on devices, including audio interfaces, because it can potentially eliminate the need for those devices to have mains power supplies of their own. So, for instance, USB 2 devices are permitted to draw up to 500mA at 5V from the host computer, while USB 3 raises this to 900mA, and Thunderbolt theoretically supports up to 550mA at 18V. However, it’s important to note that the specifications usually refer to the maximum power that an attached device is allowed to demand: they don’t guarantee that the computer will be able to meet that demand.
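Converting those current and voltage figures into watts makes them easier to compare, since power is simply current multiplied by voltage. The Python sketch below uses only the figures quoted above; the function and table names are invented for the example, and the results are the maximum a device may request, not what any particular computer will supply.

```python
def watts(milliamps, volts):
    """Power in watts from a current in milliamps and a voltage in volts."""
    return milliamps * volts / 1000


# Maximum draw permitted by each specification, as quoted in the text:
# (current in mA, voltage in V).
BUS_POWER_LIMITS = {
    "USB 2": (500, 5),
    "USB 3": (900, 5),
    "Thunderbolt": (550, 18),
}

for name, (milliamps, volts) in BUS_POWER_LIMITS.items():
    print(f"{name}: up to {watts(milliamps, volts):.1f} W")
```

On these figures, Thunderbolt allows roughly four times the power of USB 2 and a little over twice that of USB 3.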
It’s therefore possible to design bus-powered USB, Thunderbolt and Ethernet audio interfaces, but manufacturers can’t rely on there being as much power available as they’d like. This restriction can compromise the audio performance of an interface, and for this reason, bus powering is typically used only for small desktop interfaces, where the convenience outweighs potential disadvantages.
Bus powering is highly convenient for small, portable interfaces, but it’s hard to guarantee sufficient power is available for optimum audio performance or for larger devices.
Serial Vs Parallel
Digital data is made up of binary digits or bits. These can be represented any way you like (holes in a punch card, say, or pulses of light in an optical cable) but in most computer connections they are encoded as voltages. For instance, USB 2 operating in High Speed mode represents zeroes by a voltage of 0V and ones by 400mV.
Since any given piece of wire can only carry one voltage at a time, the only way to send many bits down one piece of wire is to do so one after the other. If we want to send a larger amount of data, it needs to be broken down into individual bits, which are then sent in order and reconstructed by the receiving device. A connection of this type is known as a ‘serial’ connection.
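The process of breaking a sample down into bits and rebuilding it at the other end can be sketched in a few lines of Python. This is an illustration of the principle only: the function names are invented, the sample is treated as an unsigned number, and the choice to send the most significant bit first is an assumption, since real protocols differ in bit order and add framing and encoding on top.

```python
def serialise(sample, bit_depth=24):
    """Break an unsigned sample word into a list of bits.

    Assumption: the most significant bit is sent first.
    """
    return [(sample >> shift) & 1 for shift in range(bit_depth - 1, -1, -1)]


def deserialise(bits):
    """Rebuild the sample word from bits received in the same order."""
    value = 0
    for bit in bits:
        value = (value << 1) | bit
    return value


sample = 0xABCDEF            # an arbitrary 24-bit value
bits = serialise(sample)     # what travels down the wire, one bit at a time
restored = deserialise(bits)

print(bits)
print(hex(restored))
```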
However, the internal architecture of a computer is based around groups of bits. Eight bits make up a byte, and groups of bytes are sometimes called ‘words’. The same goes for many forms of data, including audio. Each sample in a stream of digital audio is normally represented using words of either 16 or 24 bits. So, in principle, if we connected 16 or 24 wires instead of just one, we could send an entire sample in one go, instead of having to break it down into its constituent bits and send these one after the other. Given the same clock speed, in other words, we can double the rate at which data is transmitted by doubling the number of wires.
This type of connection is called a ‘parallel’ connection, and on the face of it, you might think that parallel connections are obviously better suited than serial connections for transferring lots of data in a short space of time. In fact, however, all of the connection protocols common on modern computers are serial standards. To achieve a higher bandwidth, it is easier, cheaper and more reliable to simply increase the clock speed of a serial connection — thus sending more bits per millisecond — than to accurately synchronise the sending of data across a parallel link.
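The arithmetic behind this trade-off is straightforward if we assume, as an idealisation, that each wire carries one bit per clock tick. The sketch below uses hypothetical clock speeds to show that a single wire clocked 24 times faster moves exactly as much data as 24 wires in parallel.

```python
def throughput_bps(clock_hz, wires):
    """Idealised throughput: one bit per wire per clock tick (an assumption)."""
    return clock_hz * wires


# Hypothetical figures chosen only to make the comparison easy to follow.
parallel = throughput_bps(clock_hz=1_000_000, wires=24)
serial = throughput_bps(clock_hz=24_000_000, wires=1)

print(f"24 wires at 1 MHz: {parallel:,} bits per second")
print(f"1 wire at 24 MHz:  {serial:,} bits per second")
```

The two results are identical, which is why raising the clock speed of a serial link is a practical alternative to adding more wires.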
Most computer peripherals, and certainly audio interfaces, require a bi-directional connection to the computer. In other words, data needs to travel in both directions between the interface and the computer. But, of course, if we have only a single piece of wire to make that connection, then data can only travel down it in one direction at a time. To achieve bi-directional transfer over a single connection, we need to continually switch the direction of travel, so that ‘packets’ of data travelling in one direction alternate with packets going the other way. This is how USB 1.1 and USB 2 work, and it’s called a ‘half duplex’ connection. By contrast, FireWire, USB 3, Thunderbolt and Gigabit Ethernet are all ‘full duplex’ connections, meaning that they provide separate paths for data travelling in each direction.
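A simple way to picture the difference is to count time slots. The model below is a sketch under stated assumptions rather than a description of any real protocol: every packet takes one slot, switching direction costs nothing, and the function names are invented for the example.

```python
def half_duplex_slots(packets_in, packets_out):
    """One shared path: packets in each direction must take turns."""
    return packets_in + packets_out


def full_duplex_slots(packets_in, packets_out):
    """Separate paths: both directions can be busy at the same time."""
    return max(packets_in, packets_out)


packets_in, packets_out = 8, 8  # hypothetical record and playback traffic

print("half duplex:", half_duplex_slots(packets_in, packets_out), "slots")
print("full duplex:", full_duplex_slots(packets_in, packets_out), "slots")
```

With equal traffic in both directions, the half-duplex link needs twice as many slots as the full-duplex one.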
Words: Sam Pryor