What’s The Best Advanced Packaging Option?
A dizzying array of choices and options pave the way for the next phase of scaling.
By: Mark Lapedus, Executive Editor for Manufacturing (at Semiconductor Engineering)
As traditional chip designs become more unwieldy and expensive at each node, many IC vendors are exploring or pursuing alternative approaches using advanced packaging.
The problem is there are too many advanced packaging options on the table already, and the list continues to grow. Moreover, each option has several tradeoffs and challenges, and all of them are still relatively expensive.
Advanced packaging has been around for decades. Assembling different and advanced dies in a package is one way to advance a chip design. Today, this concept is sometimes referred to as heterogeneous integration. Nonetheless, advanced packaging is mainly used for higher-end, niche-oriented applications due to cost.
That may change soon. IC scaling, the traditional way of advancing a design, shrinks different chip functions at each node and packs them onto a monolithic die. But IC scaling is becoming too expensive for many and the benefits are diminishing at each node.
While scaling remains an option for new designs, the industry is searching for alternatives, including advanced packaging. What’s changed is that the industry is developing new advanced package types or expanding the existing technologies.
The motivation behind advanced packaging remains the same. Instead of cramming all chip functions on the same die, the idea is to break the pieces up and integrate them in a package. This supposedly reduces the cost and provides better yields. Another goal is to bring the chips closer to each other. Many advanced packages bring the memory closer to the processor, enabling faster access to the data with lower latencies.
This all sounds simple, but there are several challenges here. Plus, there is no one package type that fits all needs. In fact, chip customers face a dizzying array of choices. Among them:
• Fan-out: Integrated dies and components in a wafer-level package.
• 2.5D/3D: Chips are placed side-by-side or on top of each other in a package.
• 3D-ICs: Stacking memory on memory, memory on logic, or logic on logic.
The industry also is pursuing a concept called chiplets, which enables 2.5D/3D technologies. The idea is that you have a menu of modular chips, or chiplets, in a library. Then, you integrate them in a package and connect them using a die-to-die interconnect scheme.
There are other approaches as well. So what’s the best option? The answer depends on a number of different factors.
“There are numerous packaging schemes available today and more being developed to address the spectrum of requirements,” said Kim Arnold, executive director for the advanced packaging business unit at Brewer Science. “The overall drive is for improved performance, more integration, lower cost and high reliability. Individual requirements lead to the package selection.”
Going with fan-out
For years, chipmakers introduced a new logic process with more transistor density at each node. On a two-year cadence, device makers developed chips based on the process, enabling them to lower the cost per transistor.
The big change occurred at 22nm and 16nm/14nm, when chipmakers migrated from traditional planar to advanced finFET transistors. FinFETs have enabled the industry to scale devices to 10nm/7nm, with 5nm in R&D.
“FinFET scaling reduces lateral dimensions to increase device density per unit area while increasing fin height as a way to improve device performance,” said Nerissa Draeger, director of university engagements at Lam Research, in a blog.
Now, chipmakers are working on 3nm. At each node, though, process R&D and design costs are skyrocketing. Plus, classical scaling is slowing down. “The classic 2D scaling, no question, has run out of gas, but there are all of these new opportunities in structure innovation, materials innovation, architectures, packages, in addition to 2D,” said Gary Dickerson, chief executive at Applied Materials, in a recent presentation.
While some will move to the next nodes, the costs are astronomical. That’s why many are taking a harder look at advanced packaging.
IC packaging was once a straightforward process. After a wafer is processed in a fab, the chips are diced up and then assembled into various package types.
Several years ago, the industry introduced a technology called wafer-level packaging (WLP). Unlike conventional packages, which can take up board space, WLP enables smaller packages with more I/Os.
“You don’t have to limit it to a single die with fan-out. You can do both heterogeneous and homogeneous integration, where you split your dies up and combine them in a fan-out package. You use the advantages of electrical connectivity in fan-out to interconnect different dies,” said John Hunt, senior director of engineering at ASE, in a recent presentation. “You don’t have to limit it to silicon die. You can integrate MEMS, filters and passives.”
Nonetheless, fan-in and fan-out are different. One distinction is how the two package types incorporate the redistribution layers (RDLs). RDLs are the copper metal interconnects or traces that electrically connect one part of the package to another.
In fan-in, the RDL traces are routed inwards, which limits the I/O count. In fan-out, the RDLs are routed inward and outward, enabling more I/Os.
The original fan-out technology is called embedded wafer-level ball-grid array (eWLB). Today, Amkor, ASE, JCET and others sell eWLB packages. Targeted for cell phones and other products, eWLB is a standard-density product with less than 500 I/Os.
Amkor, ASE, TSMC and others sell high-density fan-out, which has more than 500 I/Os. These packages are used in automotive, servers and smartphones.
Going forward, fan-out is expanding into new forms. Among them are:
• Fan-out system-in-package (SiP): A SiP is a multi-die package that performs a specific function. A fan-out SiP may incorporate dies and passives.
• Fan-out with high bandwidth memory (HBM): Typically, HBM, a memory stack, is incorporated on more expensive 2.5D/3D packages.
• Panel-level fan-out: Some are developing fan-out on a large square format.
These and other fan-out packages are shipping, but there are some challenges. Generally, fan-out is more expensive than the legacy packages. It’s also a confusing market with various options.
There are three ways to make fan-out: chip-first/face-down, chip-first/face-up, and chip-last. In the chip-first/face-down flow, the dies are placed in a wafer-like structure, which is filled with an epoxy mold compound. The RDLs are formed within the wafer structure. The dies are cut, forming a chip housed in a package.
All fan-out technologies present some manufacturing challenges. “The challenges include fine-pitch copper resolution of less than 2µm, with an increased number of redistribution layers,” Brewer Science’s Arnold said. “With these trends come increased reliability challenges due to thermal mismatch, warpage, fine line/space interconnects, board-level solder reliability and multi-die integration of passive and active components.”
Then, when the dies are embedded in the wafer, they tend to move, causing an unwanted effect called die shift. This impacts the yield. “Die shift in fan-out can be partially compensated by the lithography process,” said Shankar Muthukrishnan, senior director of lithography marketing at Veeco. “This is expected to be a significant challenge, especially for multi-chip modules until a long-term solution can be developed to eliminate die shift.”
Besides fan-out, IC vendors also can incorporate chips in a 2.5D package. In 2.5D, dies are stacked or placed side-by-side on top of an interposer, which incorporates through-silicon vias (TSVs). The interposer acts as the bridge between the chips and a board, which provides more I/Os and bandwidth.
In one example, an FPGA and an HBM are placed side-by-side in a 2.5D package. An HBM is a DRAM memory stack. For example, Samsung’s HBM2 technology stacks eight 8Gbit DRAM dies on top of each other.
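As a quick arithmetic check on the stack described above, the short Python sketch below computes the capacity of a memory stack from the die count and the per-die density. The function name is illustrative only and does not come from any vendor.

```python
def stack_capacity_gbytes(num_dies: int, gbit_per_die: int) -> float:
    """Total capacity of a DRAM stack in gigabytes (8 bits per byte)."""
    total_gbit = num_dies * gbit_per_die
    return total_gbit / 8


# Eight 8Gbit dies, as in the HBM2 example above.
print(stack_capacity_gbytes(8, 8))  # 8.0
```

In other words, eight 8Gbit dies add up to 64Gbit, or 8GB per stack.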
The dies are connected to one another, or to the interposer, using an interconnect technology called copper microbumps and pillars. Bumps and pillars provide small, fast electrical connections between the dies.
2.5D has some advantages, but it’s also expensive due in part to the cost of the interposer. That’s why 2.5D is limited to high-end applications.
But there is still a place for 2.5D. Some are developing new device architectures for machine learning and other apps, which require more I/Os and bandwidth.
For now, 2.5D is the only option here. Fan-out is closing the I/O gap, but it’s not there yet. In the future, 3D-ICs may fill the void.
Nonetheless, 2.5D can incorporate large die sizes. For example, an FPGA has a die size around 800mm². This is close to the maximum of a 1X reticle field size, which is 835mm².
Some new device architectures, however, require 2.5D packages with interposers that exceed the maximum reticle field size. This requires a different fabrication process. For this, the interposer is split into two smaller pieces and processed on two reticles. Then, the two reticles are stitched together, which can be an expensive and difficult process.
Still, the industry is moving forward with these large packages. For example, TSMC is readying 2.5D with an interposer at a 1.5X reticle size. “We are quickly going over 1X,” said Douglas Yu, vice president of integrated interconnect and packaging at TSMC, at a recent event. “This year, it’s 2X. 3X is coming.”
Using three reticles, TSMC has demonstrated a technology with a massive 2,460mm² interposer area. It can incorporate two 600mm² SoCs and 8 HBM2 dies in a 75mm x 75mm package size.
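To put the area figures above in proportion, here is a rough Python sketch that expresses an area as a multiple of the 1X reticle field and counts how many reticle fields it would span. It assumes a simple area ratio and ignores stitching overlap and aspect ratio, which a real flow would have to account for. The function names are hypothetical.

```python
import math

RETICLE_FIELD_MM2 = 835.0  # 1X reticle field size quoted in the article


def reticle_multiple(area_mm2: float) -> float:
    """Area expressed as a multiple of one reticle field."""
    return area_mm2 / RETICLE_FIELD_MM2


def reticles_needed(area_mm2: float) -> int:
    """Minimum number of reticle fields, by area alone (an assumption)."""
    return math.ceil(area_mm2 / RETICLE_FIELD_MM2)


for label, area in [("800mm² FPGA die", 800.0), ("2,460mm² interposer", 2460.0)]:
    print(f"{label}: {reticle_multiple(area):.2f}X, "
          f"{reticles_needed(area)} reticle field(s)")
```

By this simple measure, the 800mm² die fits within a single field, while the 2,460mm² interposer comes in just under 3X, consistent with the three reticles cited.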
Besides interposers, there are other options. Intel, for one, has developed a silicon bridge, which is an alternative to the interposer. Intel refers to its bridge as the Embedded Multi-die Interconnect Bridge (EMIB).
A bridge makes use of a tiny piece of silicon with routing layers that connects one chip to another in a package. “It takes a lot less silicon area (than an interposer),” said Babak Sabi, vice president and general manager of assembly test technology development at Intel, in a recent interview. “You can put as many bridges as you wish on a substrate. It doesn’t have any reticle size limitation like a silicon interposer.”
Intel has leveraged EMIB and other technologies for its new efforts in the 3D-IC arena. Using these technologies, Intel recently unveiled a new 3D CPU platform, which combines a 10nm processor core with four 22nm processor cores in a 3D package.
That’s just one possibility with the technology. “[This] approach gives our chip architects flexibility to mix and match IP blocks and process technologies with various memory and I/O elements in new device form factors,” Sabi said.
Today’s 2.5D/3D technologies have some scaling limitations, however. There are issues with the bumps/pillars and the tools.
In 2.5D/3D technologies, dies incorporate tiny bumps on one side. The bumps on each die are connected using thermal compression bonding (TCB). A TCB bonder uses force and heat to connect the bumps.
This is a slow process. “The bonding process has a low throughput and cannot overcome the challenge of scaling below a 40μm pitch,” said Guilian Gao, a distinguished engineer at Xperi, in a recent paper.
In fact, today’s most advanced microbumps and pillars are tiny structures with a 40μm pitch. A 40μm pitch consists of a 25μm copper pillar with 15μm spacing between pillars.
Using today’s technologies, the industry can scale the bump pitch down to 20μm or 10μm. Then, the industry needs a new solution beyond bumps/pillars, namely a technology called copper hybrid bonding.
Intel, TSMC, UMC and others are working on copper hybrid bonding, which promises to scale the pitches in packages beyond 20µm.
Hybrid bonding follows a copper damascene process. Two wafers are processed in the fab. Tiny copper interconnects are formed on one side of each wafer. Then, the two wafers are bonded.
In the bonding process, the tiny interconnects are heated up. “You heat them up, essentially creating an area or spot where the copper diffuses with each other,” explained Subodh Kulkarni, president and chief executive at CyberOptics. “If you get them hot enough and bring them into actual contact with each other, the copper atoms will go back and forth. It creates a perfect bond.”
Hybrid bonding enables a vendor to stack and connect devices directly using fine-pitch copper connections, eliminating the need for bumps and pillars. It paves the way towards more advanced forms of 2.5D, 3D-ICs and 3D DRAMs.
Hybrid bonding isn’t new. For years, CMOS image sensor vendors have used the technology. Now, the industry is working on hybrid bonding for advanced memory and logic die stacking.
Each vendor has a different strategy. Intel is developing 3D-ICs for its own product portfolio. In contrast, the foundries, such as TSMC and UMC, are developing hybrid bonding to enable new, advanced packages for outside customers.
Some are developing their own bonding technology, while others are licensing it from Xperi. Xperi’s hybrid bonding technology is called Direct Bond Interconnect (DBI).
“We have licensed technology IP from Xperi in order to accelerate our development in this area,” said Steven Liu, vice president of corporate marketing at UMC. “We think DBI is a potential technology for the ‘More than Moore’ era, whether through a wafer-to-wafer or die-to-wafer approach. UMC plans to offer DBI solutions to its customers and leverage our existing technology advantages gained from past experience.”
Copper hybrid bonding is conducted in a fab, not at an OSAT. In Xperi’s flow, metal pads are recessed on the wafer surface. The surface is planarized using chemical mechanical polishing (CMP), followed by a plasma activation step.
A separate wafer undergoes a similar process. The wafers are bonded using a dielectric-to-dielectric bond, followed by a metal-to-metal connection.
Meanwhile, TSMC is developing its own hybrid bonding technology. TSMC will use it to develop a 3D-IC technology called System on Integrated Chips (SoIC). SoIC is due out by the end of 2020.
SoIC paves the way towards integrating smaller chips with different process nodes in a package. “It’s like a single SOC,” TSMC’s Yu said. “This offers very close proximity between the integrated chips. That translates into an advantage in latency, bandwidth, power and form factor.”
TSMC’s technology is 11.9X faster with 191X more bandwidth than current 2.5D/3D devices. Initially, SoIC technology enables 9μm pitches. It enables I/O densities from 12,000/mm² to 1,200,000/mm², compared to 800/mm² for microbumps, according to TSMC.
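The link between pitch and I/O density can be sketched with simple geometry. The Python snippet below assumes the connections sit on a uniform square grid, which is an assumption for illustration only. The article does not describe TSMC's actual pad layout, and the function names are hypothetical.

```python
def io_density_per_mm2(pitch_um: float) -> float:
    """Connections per mm² on a uniform square grid with the given pitch."""
    per_mm = 1000.0 / pitch_um  # connections along 1mm
    return per_mm ** 2


def pitch_for_density_um(density_per_mm2: float) -> float:
    """Square-grid pitch (in μm) implied by a given density per mm²."""
    return 1000.0 / density_per_mm2 ** 0.5


for pitch in (40.0, 9.0):
    print(f"{pitch:.0f}μm pitch: about {io_density_per_mm2(pitch):,.0f} per mm²")

for density in (800.0, 1_200_000.0):
    print(f"{density:,.0f} per mm²: about {pitch_for_density_um(density):.2f}μm pitch")
```

Under that assumption, a 9μm pitch works out to a little over 12,000 connections per mm², in line with the low end of the quoted range, while the 800/mm² microbump figure would correspond to a pitch of roughly 35μm. The high end of the range would imply a pitch of less than 1μm.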
The company recently demonstrated the SoIC concept for a fan-out package. In its current InFO package, a memory die is on top, while a system-on-a-chip (SoC) die is on the bottom.
In SoIC technology, though, the SoC is broken up into three smaller chips. One chip is on top and two are on the bottom, which are bonded. By breaking up the larger die into smaller pieces, TSMC said it can reduce cost and boost the yields.
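The yield argument can be illustrated with the textbook Poisson yield model, Y = exp(-A × D0), where A is the die area and D0 is the defect density. This is a simplified sketch, not TSMC's method. The 600mm² and 200mm² die sizes and the defect density of 0.1 per cm² are hypothetical values chosen for illustration.

```python
import math


def poisson_die_yield(area_mm2: float, defects_per_cm2: float) -> float:
    """Fraction of good dies under the Poisson model Y = exp(-A * D0)."""
    area_cm2 = area_mm2 / 100.0
    return math.exp(-area_cm2 * defects_per_cm2)


D0 = 0.1  # hypothetical defect density, defects per cm²

monolithic = poisson_die_yield(600.0, D0)  # one large die
chiplet = poisson_die_yield(200.0, D0)     # one of three smaller dies

print(f"600mm² die yield: {monolithic:.1%}")
print(f"200mm² die yield: {chiplet:.1%}")
```

With these numbers, a larger share of the smaller dies come out good. The benefit depends on testing the smaller dies before assembly so that only the defective pieces are discarded. If three untested 200mm² dies were combined, the chance that all three are good would be the same as for the single 600mm² die under this model.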
In another example, TSMC showed a 3D device architecture with three tiers. The first tier consists of a large die. The second and third tiers each consist of three smaller chips, all embedded in a substrate.
There are some challenges here. Obtaining known good die is one issue. Aligning and stacking die accurately is another.
Hybrid bonding is also difficult. “This is not trivial,” TSMC’s Yu said. “The bonding is very critical. This bonding has a very limited thermal budget available. We need to have very good contact between the copper bonds from the two sides.”
There are other issues. “While the copper damascene process has been used for many years for BEOL metal interconnect in semiconductor fabs, there are several unique challenges for applying copper damascene to hybrid bonding – both for die-to-wafer and wafer-to-wafer,” said Stephen Hiebert, senior director of marketing at KLA.
“First, the wafer and die surfaces must be completely free of void-inducing defects. This is particularly challenging for die-to-wafer hybrid bonding because singulation is a major source for particle contamination,” Hiebert said. “Second, the CMP process must be precisely controlled such that the copper pad shape profile is maintained within the bonding process window. Third, the copper pads must be well aligned by the die-to-wafer or wafer-to-wafer bonding tools, which is increasingly difficult with smaller hybrid bonding interconnect pitches.”
Meanwhile, there are other options in the market. A group within the Open Domain-Specific Architecture (ODSA) project is defining and developing a new chiplet-based architecture. Achronix, Cisco, Facebook, Netronome, NXP, zGlue and others are working on this technology.
Developing 3D devices and chiplets presents some major challenges, including one big issue. “For multiple devices combined in heterogeneous integration, one bad die results in the failure of the whole package,” Hiebert said.
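The compounding effect behind that concern can be sketched in a few lines of Python. The example assumes die failures are independent and that any single bad die scraps the package. The per-die yields are hypothetical, as is the function name.

```python
from math import prod


def package_yield(die_yields, assembly_yield=1.0):
    """Probability that every die in the package is good and assembly succeeds."""
    return prod(die_yields) * assembly_yield


# Hypothetical four-die package at two levels of incoming die quality.
for per_die in (0.99, 0.90):
    total = package_yield([per_die] * 4)
    print(f"four dies at {per_die:.0%} each: package yield {total:.1%}")
```

Even a modest drop in the quality of incoming dies multiplies across the package, which is why known good die is such a persistent theme.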
Clearly, though, the industry is moving full speed ahead with heterogeneous integration. The good news is that there are several innovative ways to do this.
That’s also the problem. Finding the right solution is just one of many challenges in the arena.