
Everyone in AI infrastructure is talking about the stack. Nobody is talking about what holds it together.
By now, most people in the industry have heard Jensen Huang's framing of the AI factory: a five-layer stack of chips, systems, networking, software, and applications. It's a compelling model. It explains why NVIDIA isn't just a chip company. It explains why the race to build AI infrastructure is as much about software and systems as it is about silicon.
But there’s a missing layer in that model - something so fundamental that its absence is hiding in plain sight.
None of it works without optical interconnect. It is the layer that makes every other layer function.
Every layer of the AI stack depends on moving data - fast, reliably, at scale. The GPUs don't compute in isolation. The systems don't think alone. The fabric that ties a GPU cluster together, rack to rack, switch to switch, is optical. And at the heart of that fabric is the transceiver.
Consider what actually happens inside a modern AI data centre. Thousands of GPUs running in parallel. Massive model weights flowing across the interconnect during training. Inference requests hitting clusters at scale. Every token generated by a large language model, every image synthesised, every recommendation served - each one is the result of data moving across optical links at extraordinary speed.
A single modern GPU cluster for frontier AI training can require tens of thousands of optical transceiver ports. At 800G and 1.6T speeds, the aggregate bandwidth flowing through those transceivers is measured in petabits per second. This is not a peripheral concern. It is the nervous system of the AI factory and a single point of failure.
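The petabit-scale claim is easy to sanity-check with back-of-envelope arithmetic. The figures below are illustrative assumptions consistent with the text ("tens of thousands" of ports at 800G), not a specific deployment:

```python
# Back-of-envelope aggregate bandwidth for a hypothetical frontier-scale
# cluster. Port count and speed are illustrative assumptions only.

ports = 50_000           # "tens of thousands" of transceiver ports
gbps_per_port = 800      # 800G per port; 1.6T would double this

total_gbps = ports * gbps_per_port
total_pbps = total_gbps / 1_000_000   # 1 petabit = 1,000,000 gigabits

print(f"Aggregate fabric bandwidth: {total_pbps:.0f} Pb/s")  # 40 Pb/s
```

Even at the conservative end of these assumptions, the fabric carries tens of petabits per second in aggregate, which is why the interconnect is fairly described as the cluster's nervous system.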
And yet, in almost every conversation about AI infrastructure - from earnings calls to conference keynotes to procurement meetings - the transceiver is invisible. It's assumed. It's a given. It's expected to be there when needed.
Except right now, it isn't. Demand for high-speed transceivers is outpacing component supply by a significant margin, and the gap is widening as AI buildout accelerates globally. The invisible layer has become the critical constraint.
Optical transceivers have always been a background component. At 10G and 40G speeds, they were genuinely commodity - interchangeable, abundant, cheap. The industry built procurement habits around that reality. Buy on price. Buy on availability. Treat them like cables.
At 800G and 1.6T, that assumption breaks down. The physics are different. The component precision required is different. The consequences of failure - in a GPU cluster running at millions of dollars of compute per hour - are very different.
The invisible layer is becoming visible. The question is whether the industry's procurement thinking catches up before something breaks.
The next section examines how the transceiver industry trained buyers to treat critical infrastructure like a commodity - and why that habit is now a liability.
For decades, treating transceivers like interchangeable parts was rational. At 1.6T, it's a liability your procurement team may not know it's carrying.
The previous section introduced the invisible layer - the optical interconnect that ties together every GPU cluster, every AI factory, every layer of the stack.
The risk now is not visibility. It’s behaviour.
For most of the history of optical networking, transceivers were genuinely commodity. At 1G, 10G, even 40G, the technology was mature, the margins were thin, and the components were largely interchangeable. You bought on price. You bought on availability. You qualified a vendor once and assumed consistency thereafter.
The industry built an entire ecosystem around this assumption. Distributors. Private-label resellers. Brokers who could source whatever you needed, wherever they could find it, under whatever brand name moved fastest.
The problem is that the components changed. The buying habits didn't.
Here's what most neocloud procurement teams don't know: a single private-label part number can - and routinely does - conceal multiple completely different products underneath. Different DSP silicon. Different laser sources. Different driver ICs. Different thermal profiles. Different failure curves.
The label stays the same. What's inside does not. And in most cases, nobody tells you. There is no notification when the internal components change. There is no re-qualification requirement. The part number you approved six months ago may bear no physical resemblance to what arrives in your next shipment.
At 10G, this was a manageable risk. At 1.6T - in a GPU cluster running frontier AI workloads - it is not. The wrong DSP variant causes link instability. An incompatible laser source causes premature degradation. A thermal profile mismatch causes failures in high-density deployments. And in infrastructure where a single hour of downtime can cost more than the transceivers themselves, 'manageable' is the wrong frame entirely.
The risk isn't just technical. There's a compliance dimension that most buyers haven't fully reckoned with yet.
Under NIS2, operators of critical infrastructure - and the neoclouds underpinning AI workloads increasingly qualify - carry direct obligations for supply chain security and traceability. That obligation flows to component suppliers. If your transceiver vendor cannot document what's inside their product, cannot trace component origin, and cannot meet audit requirements, your compliance posture has a gap.
And then there's country of origin. 'Made in the US' and 'Made in the UK' claims are widespread in transceiver marketing. They are, in almost every case, inaccurate. The core optical components of a transceiver - the laser chips and photonic elements - are overwhelmingly manufactured in China; volume production of these components does not happen in the US or UK. Assembly and test can happen elsewhere. Component manufacturing, at scale, cannot. Misrepresenting this isn't just misleading. In a post-tariff environment, it creates real financial and legal exposure for the buyer.
The next section outlines what getting this right actually looks like - and why the shift from commodity thinking to critical infrastructure thinking changes how transceivers must be qualified.
The AI infrastructure boom is forcing a long-overdue reckoning with how optical transceivers are qualified, sourced, and trusted. Here's what getting it right actually requires.
The previous sections outlined the invisible layer - the optical interconnect that makes the AI factory function - and the commodity trap that left the industry buying critical components like spare parts. The question now is what it actually looks like to treat a transceiver as critical infrastructure.
The shift starts with a question that most procurement teams have never had to ask about a transceiver: what's actually inside it?
At 1.6T, the silicon inside a transceiver is not an implementation detail. It's the product. The DSP determines power consumption, signal integrity, compatibility, and lifespan. Choosing a transceiver without knowing which DSP it contains is like specifying a server without knowing the processor.
Our 1.6T OSFP module is built around the Marvell Ara 3nm PAM4 DSP - the most advanced DSP at this speed tier, running at 3nm process geometry for maximum power efficiency and signal performance. It operates below NVIDIA's power budget threshold, which in a dense GPU cluster isn't a footnote - it's the difference between a deployment that runs cleanly and one that requires thermal redesign.
We specify this publicly because we think buyers should demand this level of transparency from every supplier. If a vendor can't tell you which DSP is inside their 1.6T module, that itself is the answer to your qualification question.
Treating transceivers as critical infrastructure changes the qualification conversation in concrete ways:
The private-label model - where the same part number can contain different components across different production runs - is structurally incompatible with critical infrastructure requirements. You cannot audit a label. You can audit a manufacturer.
The neoclouds and hyperscalers who get this right will have a supply chain advantage that compounds over time. The ones who don't will find out the hard way - usually at the worst possible moment.
The AI infrastructure build-out has turned optical transceivers from a background procurement line item into a strategic supply chain decision. The companies that recognise this shift early - that start asking harder questions of their transceiver suppliers now - are the ones that will build AI factories that perform, scale, and comply.
Know your silicon. Know your manufacturer. Know your supply chain. At 1.6T, everything else follows from that.
The word 'manufacturer' is doing a lot of heavy lifting in the transceiver industry. Most buyers have no idea what it actually means - and the industry has made that confusion very profitable.
Ask almost any transceiver vendor if they're a manufacturer. Almost all of them will say yes. Some will say it proudly. Some will put it on their website in large font. A few will point to a factory they've visited, or a production line they've seen photos of, or a contract manufacturer somewhere in their supply chain that technically does make something they sell.
The word has been stretched so far it's nearly meaningless. And for buyers of high-speed optical interconnect running AI infrastructure, that ambiguity is no longer a minor nuance. It's a procurement risk with real consequences.
To be precise about what this distinction means:
A private label buys a finished or near-finished product from a third-party manufacturer, applies their own branding, and sells it on. They may test it. They may repackage it. They may have a support team and a sales force and a professional-looking datasheet. But they did not design it, they do not control what's inside it, and they cannot guarantee what will be inside the next shipment.
A genuine manufacturer owns the design. They specify the bill of materials - which DSP, which laser source, which driver IC, which TIA. They control the production process. They define the testing protocol and own the test data. When something changes in the component supply chain, they make an engineering decision about it - they don't just accept whatever their supplier sends.
This distinction matters enormously at 1.6T. The DSP inside a 1.6T transceiver is not a commodity interchangeable part. It determines the power consumption profile, the signal integrity characteristics, the thermal behaviour under sustained load, and the long-term reliability curve. A private label cannot tell you which DSP will be in your next order because they don't control that decision. A manufacturer can - because they made it.
At ATOP, our 1.6T OSFP is built around the Marvell Ara 3nm PAM4 DSP. That's not a footnote. That's a commitment. Same silicon, every unit, every shipment, auditable and documented. That's what manufacturing means.
The private-label model isn't inherently fraudulent. But it is structurally incapable of the consistency, traceability, and accountability that critical AI infrastructure requires. The problem isn't bad intentions. It's a model that was never designed for this level of scrutiny.
You don't need a factory tour to tell the difference. You need three questions:
First: which DSP is in your 1.6T module, and can you show me the component specification? A manufacturer answers immediately. A private label hedges, deflects, or gives you a marketing answer.
Second: if I order the same part number in six months, will the internal components be identical? A manufacturer says yes and can explain why. A private label cannot guarantee it.
Third: can you provide documented country of origin for the core components - not the assembly location, the components? A manufacturer has this. A private label frequently does not - and may claim a COO that is factually impossible.
Three questions. The answers tell you everything.
The industry has a private-label problem. It also has a deeper problem it talks about even less: manufacturers who behave like private labels - shopping components on price, swapping silicon between runs, and calling it quality control.
The distinction between private labels and manufacturers, while important, is not sufficient. A more critical line needs to be drawn.
The uncomfortable truth is that holding a design, running a production line, and calling yourself a manufacturer does not automatically mean your products are consistent, traceable, or qualified for the infrastructure they're being sold into. There is a wide spectrum between 'genuine manufacturer' and 'private label,' and a significant part of the transceiver industry occupies the uncomfortable middle ground.
Here is what that looks like in practice: a company that designs a transceiver around a specific DSP, but sources that DSP - and every other critical component - based primarily on price and lead time. When the preferred supplier has a 40-week lead time, they find another. When a different laser chip is cheaper and available now, they use it. The design stays the same on paper. What gets built does not.
This isn't a fringe practice. It's widespread. The pressure on transceiver manufacturers - particularly at the tier below the top handful of established names - to hit price points and availability windows is enormous. And the path of least resistance is component substitution: find a part that fits the footprint, passes a bench test, and ships in time.
The problem is that 'passes a bench test' and 'performs reliably in a high-density GPU cluster for three years' are not the same requirement. A bench test validates basic functionality. It does not validate thermal behaviour under sustained load in a tightly packed chassis. It does not validate long-term degradation curves. It does not validate compatibility with the specific OEM switching silicon your customer is running.
And it absolutely does not tell you what will be in the next production run.
The buyer thinks they're getting a manufactured product with a consistent, qualified BOM. What they're actually getting is whatever combination of available components hit the price point that week. The datasheet doesn't change. The product does.
Identifying this behaviour is harder than identifying a private label, because the company genuinely does manufacture - they have the factory, the engineers, the test equipment. The question isn't whether they make it. It's whether what they make is the same thing every time.
The questions that expose the grey zone:
Can you show me your BOM for the last four production runs of this part number? Are the critical components - DSP, laser, driver, TIA - identical across all four? If there were any substitutions, what engineering validation was performed before the substitution was approved?
Can you provide traceability documentation linking a specific unit in my infrastructure to a specific production run and component batch?
What is your process when a preferred component becomes unavailable - and who has authority to approve a substitution?
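Clean answers to these questions rest, in practice, on per-unit traceability records. The sketch below shows the minimum fields such a record might carry; the class, field names, and values are all hypothetical illustrations, not any vendor's actual schema:

```python
from dataclasses import dataclass

# Hypothetical per-unit traceability record. All field names and example
# values are illustrative - not a real part number or vendor schema.
@dataclass(frozen=True)
class TraceRecord:
    serial_number: str     # the specific unit in the buyer's infrastructure
    part_number: str       # the catalogue part number on the label
    production_run: str    # links the unit to one production run
    dsp_batch: str         # component batches for the critical BOM items
    laser_batch: str
    driver_ic_batch: str
    tia_batch: str
    test_report_id: str    # per-unit optical test data

record = TraceRecord(
    serial_number="SN-000123",
    part_number="EXAMPLE-1600-OSFP",
    production_run="RUN-2025-07-A",
    dsp_batch="DSP-B14",
    laser_batch="LAS-B62",
    driver_ic_batch="DRV-B09",
    tia_batch="TIA-B31",
    test_report_id="TST-000123",
)
print(record.production_run)
```

A supplier who maintains records in this shape can answer all three questions from documentation rather than memory; a supplier who cannot produce anything equivalent is, by definition, in the grey zone.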
Most manufacturers in the grey zone cannot answer these questions cleanly. The substitution decisions are made informally, documented poorly or not at all, and driven by supply chain pressure rather than engineering rigour.
At ATOP, our 1.6T OSFP has a fixed BOM. The Marvell Ara 3nm PAM4 DSP is not a preference - it is a specification. It is in every unit. It will be in every future unit unless a formal engineering change process - with full requalification - is completed and documented. That process exists in writing. It has sign-off requirements. It produces an audit trail.
This is what separates a manufacturer that takes quality seriously from one that uses quality language while making availability-driven decisions. The difference isn't visible in the datasheet. It's visible in the process documentation, the BOM traceability records, and the willingness to answer hard questions directly.
For neoclouds building AI infrastructure that needs to run reliably for years, at scale, under audit - the grey zone is not acceptable. Not because the components that get substituted in are necessarily bad. But because you have no way of knowing. And in critical infrastructure, 'probably fine' is not a quality standard.
The AI infrastructure build-out is forcing a reckoning with decades of loose practices in the transceiver industry. Private labels masquerading as manufacturers. Manufacturers behaving like brokers. COO claims that don't survive basic scrutiny. Component substitutions made for commercial reasons and documented as engineering decisions.
The neoclouds and hyperscalers who get ahead of this - who start demanding BOM traceability, fixed component specifications, and honest COO documentation now - will build supply chains that hold up under the scrutiny that's coming. The ones who don't will find out what's actually in their infrastructure at the worst possible time.
The bar is not complicated. It's just rarely enforced. It should be.
This paper has examined private labels, grey-zone manufacturers, and component substitution. There's one more distinction the industry avoids making openly - and it's the one that separates infrastructure-grade manufacturing from everything else.
The previous sections have drawn increasingly sharp lines. Between private labels and manufacturers. Between manufacturers who hold a fixed BOM and those who shop components on price and lead time. Each line matters. But there's a final line that sits above all of them - and very few companies in this industry can stand on the right side of it.
OEM manufacturing. Not the label. The reality.
The term gets used loosely - like 'manufacturer,' it has been stretched to cover a wide range of actual practices. So it is worth being specific about what OEM-grade manufacturing actually requires, and why so few of the smaller players in the transceiver industry can genuinely claim it.
OEM status isn't self-declared. It's earned through approval processes run by the companies whose infrastructure your products go into - Cisco, NVIDIA, and others who set qualification bars that the majority of the market cannot clear.
Start with the factory. ATOP operates the first fully automated Industry 4.0 manufacturing facility in the transceiver industry. Not partially automated. Not automated in the headline processes with manual steps elsewhere in the line. Fully automated - end to end, digitally integrated, with real-time process control and traceability at every stage of production.
This matters for a reason most buyers don't think about: human hands in a production process are a source of variance. Variance in assembly torque. Variance in cleaning. Variance in alignment. Variance in test handling. A fully automated line eliminates that variance. What comes off the line on day one is identical to what comes off the line on day five hundred.
Layer on ISO 9001 and ISO 14001 certification - not as paperwork exercises but as the structural framework that governs process control, change management, and continuous improvement across the entire operation. And 100% automated optical testing on every unit before it ships. Not batch testing. Not statistical sampling. Every single unit.
Field failure rates by test regime:
100% automated optical testing, every unit: <0.02%
Batch testing, statistical sampling: 1–3%
That gap - less than 0.02% versus 1 to 3% - is not a marginal improvement. It is a difference of two orders of magnitude. In a deployment of 10,000 transceiver ports, that's the difference between fewer than 2 failures and up to 300. In a GPU cluster where each failed link degrades training runs and triggers engineering intervention, that number has a direct dollar value.
Behind the factory floor sits something less visible but equally important: a serious, sustained commitment to research and development. OEM status is not just about how well you manufacture today's product. It's about whether you have the engineering depth to design tomorrow's - and to continuously improve the processes, materials, and component choices that determine quality at every node.
ATOP runs a significant in-house R&D team. Not a handful of engineers maintaining existing designs - a dedicated organisation working on next-generation product development, component evaluation, process optimisation, and qualification testing. This is the engine that keeps an OEM manufacturer ahead of the technology curve rather than chasing it.
For context: a private label has no R&D team because they have no product to develop. A grey-zone manufacturer may have a small engineering group keeping existing designs current. An OEM-grade manufacturer invests in R&D at a scale that most smaller players in this industry simply cannot sustain - because the cost of staying at the frontier of optical technology, process automation, and qualification standards is high, ongoing, and non-negotiable.
R&D is also what separates a manufacturer that responds to technology transitions from one that leads them. At 400G, 800G, and now 1.6T, the companies that were ready at launch - with qualified, tested, OEM-approved products - were the ones with the engineering investment to see it coming. The ones scrambling to qualify are the ones who weren't.
OEM qualification requires investment that the majority of smaller transceiver manufacturers have not made and cannot quickly make. The Industry 4.0 automation infrastructure alone represents a capital commitment that takes years and significant engineering depth to build correctly. The ISO certification framework requires sustained operational discipline. The 100% optical test requirement demands test capacity that scales with production volume. The R&D organisation requires a pipeline of engineering talent and the long-term investment horizon to build and retain it.
And OEM approval from Cisco, NVIDIA, and peers requires demonstrating all of the above to organisations whose qualification processes are rigorous, time-consuming, and unforgiving. You don't get approved by writing a good datasheet. You get approved by opening your factory, your process documentation, your R&D records, and your quality data to scrutiny - and passing.
The result is that OEM-grade manufacturing in the transceiver industry is genuinely rare. Not rare because the standards are secret. Rare because meeting them is hard, expensive, and takes years of sustained commitment to build and maintain.
Most buyers have never asked their transceiver supplier whether they hold OEM qualification - because it never seemed like a necessary question. At 1.6T, running AI infrastructure at scale, under NIS2 audit obligations, it is the first question. Everything else follows from the answer.
This paper began by outlining the invisible layer - the optical interconnect that makes the AI factory run. It has examined the realities behind that layer: the supply constraints, the persistence of commodity buying habits, the risks of private-label sourcing, the limitations of grey-zone manufacturers, and the distinction between holding a design and holding a standard.
The thread running through all of it is consistent: at the speed and scale that AI infrastructure now demands, the component choices, manufacturing standards, R&D depth, and supply chain accountability of a transceiver supplier are no longer background considerations. They are strategic decisions with direct consequences for performance, reliability, compliance, and cost.
OEM-grade manufacturing is the standard that infrastructure of this importance requires. It exists. It is achievable. And it is rare enough that identifying who genuinely meets that standard - and who does not - is one of the most important due diligence steps a neocloud procurement team can take.
ATOP operates to that standard. Its manufacturing processes, component control, and R&D investment are built to support the performance, consistency, and transparency that AI infrastructure demands - today and at the next generation of scale.