In this figure, there’s one point in the processing timeline at which the model predicted a big jump in the probability of a bad wafer, a ‘badness’ prediction for short. That analysis means two things. The first is that the timestamp is relevant to the defect in some way, and the second is that the wafer was predicted to be bad months before the final measurement. “So you could stop the process if you knew this fact,” said Ide.
Their scoring algorithm, called the Trajectory Shapley Value, is a novel extension of the well-known Shapley value from game theory, and it is meant to give engineers some priority recommendations. With this particular model, Ide and his colleagues don’t try to guess what exactly is going on with the machines; they just identify the time when something goes wrong.
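The Trajectory Shapley Value itself is not spelled out here, but the classic Shapley value it builds on can be sketched briefly. The example below is only an illustration of that classic definition: each "player" receives its marginal contribution averaged over every possible ordering. The step names and the `toy_badness` function are hypothetical stand-ins, not the team's model or data.

```python
from itertools import permutations

def shapley_values(players, value):
    """Classic Shapley value: average marginal contribution over all orderings."""
    totals = {p: 0.0 for p in players}
    orderings = list(permutations(players))
    for order in orderings:
        coalition = frozenset()
        for p in order:
            with_p = coalition | {p}
            # Marginal contribution of p when it joins the current coalition.
            totals[p] += value(with_p) - value(coalition)
            coalition = with_p
    return {p: total / len(orderings) for p, total in totals.items()}

def toy_badness(steps):
    """Hypothetical 'badness' score for a set of process steps (made-up numbers)."""
    score = 0.0
    if "etch" in steps:
        score += 0.5
    if "litho" in steps:
        score += 0.125
    if "etch" in steps and "polish" in steps:
        score += 0.25  # interaction that appears only when both steps are present
    return score

scores = shapley_values(["litho", "etch", "polish"], toy_badness)
print(scores)
```

In this toy setup the interaction term is split evenly between the two steps that cause it, and the scores add up to the badness of the full set of steps, which is the property that makes Shapley-style scores usable as a priority ranking.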
In a second paper, they used a different type of model. Instead of a good-bad classification, this trajectory-based prediction model predicts actual defect density and tries to identify which process is most responsible.
Again, responsibility scores are calculated, but this time they come from process attributes — wafer quality data collected throughout the manufacturing process. But how do you convert process attributes into numbers that you can analyze? For this they propose a technology called proc2vec, an approach inspired by word2vec, a well-known technique in natural language processing. Much like a transformer automatically analyzes the interdependency between words without being given explicit grammar knowledge, proc2vec is meant to automatically capture hidden dependencies among silicon wafer processes and inline measurements.
For example, using wafer history data from IBM Research Albany, the team demonstrated that incorporating these interdependencies significantly enhances defect prediction accuracy. Their new attribution method, built on the trajectory-based model, successfully identified potentially anomalous processes caused by unusually long waiting times.
The team’s third paper takes aim at WIP (work-in-progress) bubbles in the fab, which are rather like traffic congestion among wafer lots, the groups of wafers that go through the build process at the same time. Wafer lots move around a fab’s railway at different rates, so there can be a surprising amount of randomness.
To understand how wafer traffic gets jammed up, fabs use an advanced semiconductor manufacturing simulator (ASMS), much like a traffic simulator for city planning. Running the ASMS is computationally demanding, so a simplified model based on queuing theory has been used for many years. Queuing theory may rely on assumptions that are too simple, though, and it can underestimate how truly variable a fab can be, according to Ide.
Using an alternative mathematical model called the Hawkes process, which accounts for event history, the team analyzed wafer history data from IBM Research Albany. They found that this approach, evaluated using a statistical model selection criterion called the Akaike Information Criterion (AIC), provides a much better fit between predicted lot arrival times and the actual times than queuing theory assumptions do.
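To make the comparison concrete, here is a minimal sketch of how an AIC comparison between a constant-rate Poisson arrival model and a Hawkes process could be set up. It assumes an exponential decay kernel for the Hawkes process, which is a common textbook choice and not necessarily the one the team used. The arrival times and the Hawkes parameters (`mu`, `alpha`, `beta`) are made-up placeholders; a real comparison would fit them by maximum likelihood.

```python
import math

def poisson_loglik(times, horizon):
    """Log-likelihood of a constant-rate Poisson process at its MLE rate n / T."""
    n = len(times)
    rate = n / horizon
    return n * math.log(rate) - rate * horizon

def hawkes_loglik(times, horizon, mu, alpha, beta):
    """Log-likelihood of a Hawkes process on [0, horizon] with intensity
    mu + alpha * sum(exp(-beta * (t - t_i))) over earlier events t_i < t.
    `times` must be sorted in increasing order."""
    loglik = -mu * horizon
    decayed = 0.0  # running sum of exp(-beta * (t - t_i)) over earlier events
    prev = None
    for t in times:
        if prev is not None:
            decayed = math.exp(-beta * (t - prev)) * (1.0 + decayed)
        loglik += math.log(mu + alpha * decayed)
        # Integrated contribution of this event's kernel up to the horizon.
        loglik -= (alpha / beta) * (1.0 - math.exp(-beta * (horizon - t)))
        prev = t
    return loglik

def aic(loglik, n_params):
    """Akaike Information Criterion: 2k - 2 ln L (lower is better)."""
    return 2 * n_params - 2 * loglik

# Hypothetical, visibly clustered lot arrival times over a 10-unit window.
arrivals = [0.5, 0.6, 0.7, 5.0, 5.1, 5.2, 9.0, 9.1]
horizon = 10.0

aic_poisson = aic(poisson_loglik(arrivals, horizon), n_params=1)
# Hand-picked, not fitted, parameters; alpha / beta < 1 keeps the process stable.
aic_hawkes = aic(hawkes_loglik(arrivals, horizon, mu=0.3, alpha=1.0, beta=2.0), n_params=3)
print(aic_poisson, aic_hawkes)
```

The AIC penalizes the Hawkes model for its two extra parameters, so it is preferred only if the gain in likelihood from modeling event history outweighs that penalty.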
One measure they used is called ‘X-factor,’ the ratio of actual cycle time to ideal cycle time. The ideal cycle time is the shortest possible processing time, assuming only one wafer exists in the entire fab: no waiting time, just moving and tooling. “So typically X-factor is much more than 1, typically 10 or 15,” said Ide.
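As a quick worked example of that ratio, the sketch below computes an X-factor from two cycle times. The 60-day and 5-day figures are hypothetical, chosen only so the result lands in the range Ide mentions.

```python
def x_factor(actual_cycle_time, ideal_cycle_time):
    """Ratio of actual cycle time to ideal (no-waiting) cycle time."""
    if ideal_cycle_time <= 0:
        raise ValueError("ideal cycle time must be positive")
    return actual_cycle_time / ideal_cycle_time

# Hypothetical lot: 60 days in the fab, 5 of them spent moving and on tools.
example = x_factor(actual_cycle_time=60.0, ideal_cycle_time=5.0)
print(example)  # 12.0
```

Both times must be in the same unit; an X-factor of exactly 1 would mean the lot never waited.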
They found that non-uniformity in lot arrival times at certain tool positions dramatically increased the time it took to complete a wafer. Following their model, it turns out that even if average tool utilization is controlled, X-factor can be much bigger than the traditional queuing-based model would predict.
This suggests that traditional queuing theory-based WIP analysis needs to be revised, at least for semiconductor manufacturing. This paper points out the problem with the existing approach, but it doesn’t necessarily propose a solution. It does, however, suggest that the Hawkes model is better.
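The effect of uneven arrivals at a fixed utilization can be seen even in a toy setting. The sketch below is not the team's Hawkes-based analysis; it assumes a single tool with first-in, first-out service and a fixed processing time, and computes waiting times with Lindley's recursion. Both arrival patterns are made up and deliver the same average of one lot per time unit, so utilization is the same in both cases.

```python
def mean_x_factor(arrival_times, service_time):
    """Mean X-factor at one FIFO tool with a fixed service time.
    Waiting times follow Lindley's recursion:
    wait_next = max(0, wait + service_time - interarrival_gap)."""
    wait = 0.0
    total_cycle = 0.0
    prev_arrival = None
    for t in arrival_times:
        if prev_arrival is not None:
            wait = max(0.0, wait + service_time - (t - prev_arrival))
        total_cycle += wait + service_time
        prev_arrival = t
    mean_cycle = total_cycle / len(arrival_times)
    return mean_cycle / service_time

service = 0.8  # each lot occupies the tool for 0.8 time units (utilization 0.8)

even = [float(i) for i in range(20)]               # one lot every time unit
bursty = [float(5 * (i // 5)) for i in range(20)]  # five lots at once, every 5 units

x_even = mean_x_factor(even, service)
x_bursty = mean_x_factor(bursty, service)
print(x_even, x_bursty)
```

With evenly spaced arrivals no lot ever waits, so the X-factor is 1. With the clumped arrivals the lots in each batch queue behind one another and the mean X-factor rises to about 3, even though the tool is busy the same fraction of the time.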
Much of this work is in its early stages. Along the way, the team identified a major limitation: only superficial information about process parameters is available. To address that, the IBM Research team plans to incorporate physics-based information in their future work.
Long term, their goal is to apply these learnings and models to real production lines to improve the quality of wafer fabrication runs.