Interpretability in Machine Learning

Since OpenAI released ChatGPT, its large language model (LLM) chatbot, machine learning and artificial intelligence have entered mainstream discourse. The reaction has been a mix of skepticism, trepidation, and panic as the public comes to terms with how this technology will shape our future. Many fail to realize that machine learning already shapes the present, and many developers have been grappling with introducing this technology into products and services for years. Machine learning models are used to make increasingly important decisions – from aiding physicians in diagnosing serious health issues to making financial decisions for customers.

How it Works

I strongly dislike the term "artificial intelligence" because what the phrase describes is a mirage. There is no complex thought process at work – the model doesn't even understand the information it is processing. In a nutshell, OpenAI's model powering ChatGPT calculates the statistically most probable next word given the immediately surrounding context based on the enormous amount of information developers used to train the model.
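To make that concrete, here is a deliberately tiny sketch in Python. It is nothing like a real LLM, which runs a deep neural network over a long context window, but it shows the basic idea of choosing a statistically probable next word from previously seen text; the training sentence and function names are my own.

```python
from collections import Counter, defaultdict

# Toy illustration only: a bigram "language model" that returns the most
# frequent word observed after the current word in a tiny training text.
# Real LLMs run deep neural networks over long contexts, but the core idea
# of predicting a statistically likely next word is the same.
training_text = (
    "the model predicts the next word the model predicts the most likely word"
)

counts = defaultdict(Counter)
words = training_text.split()
for current, following in zip(words, words[1:]):
    counts[current][following] += 1

def predict_next(word: str) -> str:
    """Return the most frequently observed word following `word`."""
    if word not in counts:
        return "<unknown>"
    return counts[word].most_common(1)[0][0]

print(predict_next("the"))    # 'model' -- the most common continuation
print(predict_next("model"))  # 'predicts'
```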

A Model?

Let's say we compiled an accurate dataset containing the time it takes for an object to fall from specific heights:

Height Time
100 m 4.51 sec
200 m 6.39 sec
300 m 7.82 sec
400 m 9.03 sec
500 m 10.10 sec

What if we need to determine the time it takes for that object to fall from a distance we don't have data for? We build a model representing our data and either interpolate or extrapolate to find the answer:

t = √(2h / g), where h is the height and g is the acceleration due to gravity (about 9.81 m/s²)
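As a quick sanity check, a few lines of Python can evaluate this model for heights inside and outside our table, assuming standard gravity and ignoring air resistance:

```python
import math

G = 9.81  # standard gravity in m/s^2; air resistance ignored

def fall_time(height_m: float) -> float:
    """Time in seconds for an object to fall height_m meters from rest."""
    return math.sqrt(2 * height_m / G)

# Interpolating within the table, then extrapolating beyond it.
print(f"{fall_time(250):.2f} s")   # ~7.14 s, between the 200 m and 300 m rows
print(f"{fall_time(1000):.2f} s")  # ~14.28 s, well past our measured data
```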

Models for more complex calculations are often created with neural networks: mathematical systems that learn skills by analyzing large amounts of data. Each node in a vast collection evaluates a specific function and passes its result to the next node. Simple neural networks can be expressed as mathematical functions, but as the number of variables and nodes increases, the model can become opaque to human comprehension.
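The mechanics are easy to show at toy scale. The sketch below, with weights I made up, pushes two inputs through a two-node hidden layer and a single output node; every number is legible here, which stops being true once a model has millions of weights.

```python
import math

def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))

def forward(inputs, layers):
    """Each node multiplies its inputs by coefficients (weights), adds a
    bias, and passes the result through a nonlinearity to the next layer."""
    values = inputs
    for weights, biases in layers:
        values = [
            sigmoid(sum(w * v for w, v in zip(node_weights, values)) + bias)
            for node_weights, bias in zip(weights, biases)
        ]
    return values

# Two inputs -> two hidden nodes -> one output, with made-up weights.
# Six weights are still readable; millions or billions are not, which is
# where human interpretability breaks down.
layers = [
    ([[0.8, -1.2], [0.5, 0.9]], [0.1, -0.3]),  # hidden layer
    ([[1.5, -0.7]], [0.2]),                    # output layer
]
print(forward([0.4, 0.7], layers))
```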

The Interpretability Problem

Unfortunately, it is impossible to open up many complex models and provide a precise mathematical explanation for a given decision. In other words, models often lack human interpretability and accountability. We often can't say, mathematically speaking, exactly how the network makes the distinction it does; we only know that its decisions align with those of a human. It doesn't require a keen imagination to see how this presents a problem in regulated, high-stakes decision-making.

Let's say John visits a lender and applies for a $37,000 small business loan. The lender needs to determine the probability that John will default on the loan, so they feed John's information into an algorithm, which computes a low score, causing a denial. By law, the lender must provide John with a statement of the specific reasons for the denial. In this scenario, what do we tell John? Today, we can reverse-engineer the model and provide a detailed answer, but even the simple models of tomorrow will quickly test the limits of human understanding as computing resources become more powerful and less expensive. So how do we design accountable, transparent systems in the face of exponentially growing complexity?

Solutions?

Proponents of interpretable models suggest limiting the number of variables used in a model. The problem with this approach becomes apparent after considering how neural networks weight variables. Models multiply results by coefficients that determine the relative importance of each variable or calculation before passing them to the next node. These coefficients and variables are often 20 to 50 decimal places long and can be positive or negative. While understanding the data underpinning a decision is essential, it is not enough on its own to provide a clear explanation. We can partially solve this problem by building tooling that abstracts implementation details and provides a more intelligible overview of the model; however, this still only approximates the decision-making process.

Other thought leaders in machine learning argue that the most viable long-term solutions may not involve futile attempts to explain the model but should instead focus on auditing and regulating performance. Do large volumes of test data reveal statistical trends of bias? Does analyzing the training data show any gaps or irregularities that could result in harm? Unfortunately, this does not solve the issue in my hypothetical scenario above. I can't conclusively prove that my current decision was correct by pointing to past performance.

Technology is simply moving too rapidly to rely on regulations, which are, at best, a lagging remedy. We must pre-emptively work to build explainability into our models, but doing this in an understandable and actionable way will require rethinking our current AI architectures. We need forward-looking solutions that address bias at every stage of the development lifecycle with strong internal governance. Existing systems should undergo regular audits to ensure small changes haven't caused disparate impacts.

I can't help but feel very lucky to live in this transformative sliver of time, from the birth of the personal computer to the beginning of the internet age and the machine learning revolution. Today's developers and system architects have a massive responsibility to consider the impact of the technology they create. The future adoption of AI heavily depends on the trust we build in our systems today.

Data-Driven Espresso

I have a well-earned reputation as a "coffee snob" at work. Co-workers snicker as I don my jacket, preparing to walk eight blocks in subzero temperatures just for a better cup of coffee. After earning this reputation, I'm often asked about coffee, particularly espresso. When asked about options for making espresso at home, I usually respond with another question—do you want a new hobby?

Lately, I've tunneled deeply into the bottomless rabbit hole of coffee. As is my nature, I've taken an intensely data-driven approach to experimenting with flavor and maintaining consistency. Tightly controlling variables and changing one at a time is the only meaningful way to judge the outcome of a change. But, of course, this requires extreme precision, which is where equipment and technique come into play.

Most espresso machines, even those at the high end, fail to provide feedback about the brewing process. Defects manifest themselves clearly through tasting, but the ultimate cause is often unclear. This lack of transparency is frustrating for a person with a deeply analytical personality. Luckily, data-driven coffee nerds now have options.

A monumentally modest company named Decent has become an industry leader in the art of brewing espresso with the extreme precision afforded only by an automated, software-driven design. Every variable can be controlled and dissected, from pressure to flow, weight, temperature, and time.

The DE-1 after brewing the second best espresso I’ve ever had.

Decent Espresso Machine

The Decent espresso machine is a game-changer. It offers a level of control and precision unmatched by other espresso makers, which translates into unparalleled consistency. There is no better option for technophile coffee lovers looking to take their espresso brewing to the next level.

The machine's software allows for an incredible level of customization. Users can create and save their own recipes and profiles, tailoring the brewing process to their exact preferences. The software also provides real-time feedback, making it easy to make adjustments throughout the extraction process.

One of the most impressive features of the Decent machine is its ability to track and display data about each shot. For example, I can compare a ten-second pre-infusion followed by a standard nine-bar pressure profile against a pre-infusion followed by a long "bloom" phase that reduces astringency and bitterness.

The traditional flat nine-bar pressure profile has become the industry standard not because it offers the best extraction, but because it is a good compromise between quality and time, an essential consideration for a busy cafe. Applying modern technology to a century-old brewing process demonstrates that no system, no matter how many decades of incremental refinement stand behind it, can transcend the benefits of human creativity mixed with a pinch of technology.

Increase Efficiency with Platform Cache

Platform Cache is a memory layer that stores your application's session and environment data for later access. Applications run faster because they store reusable data instead of retrieving it whenever needed. Note that Platform Cache is visible and mutable by default and should never be used as a database replacement. Developers should use cache only for static data that is either frequently needed or computationally expensive to acquire. Let's explore the use of cache in a simple Apex class.
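The original listing isn't reproduced here, but a minimal Apex sketch of the uncached version might look something like this (the class and method names are my own):

```apex
public class SchemaExplorer {
    // Build a map of every sObject API name in the org to its describe token.
    // Schema.getGlobalDescribe() is expensive, and nothing here is cached,
    // so every invocation pays the full cost again.
    public static Map<String, Schema.SObjectType> getSchemaMap() {
        return Schema.getGlobalDescribe();
    }
}
```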

In the example above, we describe the objects in the environment to build a schema map. The Schema.getGlobalDescribe() function returns a map of all sObject names (keys) to sObject tokens (values) for the standard and custom objects defined in the environment in which we're executing the code. Unfortunately, we're not caching the data, which makes this an expensive process: the code consumes 1,307 ms of CPU time with a heap size of 80,000 bytes. Let's improve this code by using a cache partition.
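Again, a sketch rather than the original listing: the partition name below is an assumption (create your own under Setup), and because cached values must be serializable, this version stores the list of object API names rather than the describe tokens themselves.

```apex
public class CachedSchemaExplorer {
    // Assumed partition name; define a real partition under Setup > Platform Cache.
    private static final String PARTITION = 'local.SchemaCache';
    private static final String KEY = 'objectNames';

    public static List<String> getObjectNames() {
        Cache.OrgPartition part = Cache.Org.getPartition(PARTITION);

        // Return the cached value when present.
        List<String> names = (List<String>) part.get(KEY);
        if (names == null) {
            // Cache miss: perform the expensive global describe once, then
            // store the serializable result for later transactions.
            names = new List<String>(Schema.getGlobalDescribe().keySet());
            part.put(KEY, names);
        }
        return names;
    }
}
```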

This code performs the same operation but caches the result. We first get a handle to a cache partition and run the same global describe; this time, however, we place the result in the cache for later use. Our processing requirements diminished significantly, consuming only 20 ms of CPU time.

Despite the breathtaking advances in processing power, developers should always ensure they are writing efficient code that possesses a minimal processing footprint and scales with increased volume.

Further Reading

Salesforce Developer Guide - Platform Cache

The Paradox of Efficiency

It started earlier than I thought. In January, I wrote an article making predictions for 2023. One of my subheadings was “A Year of Doing More with Less,” where I argued that companies need to look for focused, strategic areas of investment to increase efficiency. We’re now seeing significant layoffs in the technology sector. Year to date, Google has laid off 12,000 workers, Microsoft 10,000, and Salesforce 8,000. Unfortunately, these companies are taking a short-term view of efficiency that will damage their long-term success. Instead of finding areas where technologies can work together to provide multiplicative value, these CEOs are chasing short-term gains over long-term efficiency. I would argue that this quest for efficiency may actually decrease real efficiency.

Aggressive Headcount Reduction Limits Cross-Selling

Customer acquisition has its limits. Eventually, continued growth requires selling additional services to existing customers. Gathering revenue figures from sales is a trivial task, but it is challenging to pinpoint how much of a role customer satisfaction with the servicing of existing products plays. That difficulty in attributing hard figures makes these areas prime targets for headcount reduction. Why would a customer consider making another purchase when the business cannot support the products they’ve already bought? Platform lock-in has limits, and customers will eventually move to a competitor. Headcount reduction decisions are often made with the flawed assumption that all other variables will remain constant, that productivity gains elsewhere will offset the smaller workforce. But this is seldom true unless the reduction is minimal.

The Inefficient Process of Gaining Efficiency

A consequence of chasing efficiency is its opportunity cost: the drain on resources that would have promoted real efficiency in the long term. Isn’t it curious that the companies most aggressively pursuing efficiency at all costs are often stuck making incremental improvements to existing technology? Why aren’t they the ones most often responsible for radical, groundbreaking innovations? Why do comparatively small startups with different organizational values so often make these genuine breakthroughs? Companies with aggressive management directives to slash costs and reduce overhead often fail to invest in areas that produce innovation. In the long term, this lack of investment profoundly impacts company culture, often precipitating an exodus of forward-looking employees. Our industrial society values rapid and predictable returns on investment and neglects the necessarily inefficient process of innovation, which shareholders see as wasteful. This is the crux of the paradox: the quest for “friction-free” processes may be slowing the discovery of more fundamental changes that would have a far more profound impact on efficiency.

Our society views imagination with a strong sense of ambivalence. Humans are naturally short-term thinkers, and it takes an abundance of thoughtfulness to understand how a series of decisions made today will make a larger impact tomorrow.

Open-Source Has a Problem

Would you work for free? What about for "exposure?" Of course, your answer may vary greatly depending on context. I'll gladly volunteer my time to do architectural photography for a local nonprofit group trying to get a building on the National Historic Register, but I would feel less generous if my employer asked me to design a new customer management system in my spare time. The difference, of course, is intent—the nonprofit group is doing something to benefit all community members, present and future, while my employer is trying to maximize profits. What happens when this dynamic presents itself in the open-source software community?

The core-js library is the most widely used JavaScript library, providing hundreds of polyfill modules that significantly extend the functionality of the standard JavaScript library. While you may not have heard of it, many of the largest technology-forward companies, including LinkedIn, Netflix, Apple, eBay, Spotify, and many others, use the library. This website uses core-js. Neither I nor any of the companies mentioned above paid for core-js—it's an open-source library available for download on GitHub. You may not realize that this library is actively maintained primarily by a single developer. What if I told you that this developer only makes $57 per month on this library, despite its use by Fortune 500 companies collectively worth billions?

So, What's Next?

Last week, an intriguing commit appeared in the core-js GitHub repository containing a markdown file named 2023-02-14-so-whats-next.md. Looking at the title, I expected a development roadmap for the following year. Instead, I was greeted with an open letter to the community expressing feelings of frustration and despair. Maintenance requirements for this library are growing, along with entitled demands for additional features. It's tempting to write this off as a single cranky and overworked developer until you realize this scenario has played itself out dozens of times over the years. In 2022, Marak Squires, developer of the 'colors' and 'faker' libraries, purposefully corrupted them in protest of "support[ing] Fortune 500s (and other smaller sized companies) with my free work." There are countless examples of similar scandals, pointing to a systemic problem in the open-source community. Companies don't blink an eye at the cost of most enterprise software. Even moderately sized companies often spend over one million dollars yearly for a single piece of software, yet never consider a modest contribution to a free software stack crucial to their success.

It's About Respect

Financial pressures aren't the only problem. Maintaining good open-source code is difficult. Maintainers must wade through mountains of pull requests (if they're lucky enough to have contributors) and review the code for quality. Rejected pull requests are often a source of friction among project contributors, but most pressure comes from users. Maintainers are placed under extraordinary pressure and routinely barraged with rude complaints from angry users and pointed questions from companies when bugs appear. As it turns out, providing software free of charge and without warranty doesn't prevent people from feeling entitled when something goes wrong.

Modern software is mind-bogglingly complex. Every developer today stands atop the shoulders of the developers who came before them; it's nearly impossible to build software to modern expectations entirely from scratch. I might write several thousand lines of code for an application, but I may have imported dependencies and frameworks containing tens of thousands of lines of code that I don't have the time (and may not even have the skill) to rewrite. Libraries like core-js are essential pieces of digital infrastructure, and finding solutions to these systemic issues is crucial to maintaining resilient software.

User Adoption Strategies

A clear, shared strategic vision among management is essential in driving widespread platform adoption, but it alone is not enough to ensure success. Users need to feel as though they have a stake in the eventual success or failure of the platform, and the software must be perceived as an asset to the user's daily workflow. In short, the technology needs to be enticing and user-friendly enough to exude a slight gravitational pull, attracting users without a hard push from management.

The user experience (UX) is a popular topic, and rightly so; it is all too easy to neglect the user's overall experience when solving problems from a purely technical perspective. However, UX discussions often fail to account for users' perceptions and their willingness to change. The actual user experience and how the user perceives that experience often diverge, for several reasons.

Application Rationalization

Increasing competitive pressure on price and service quality has forced many IT departments into large software rationalization initiatives: creating inventories of applications and looking for ways to systematically retire, refactor, or re-platform them to reduce the organization's overall software footprint. This strategy effectively cuts costs while increasing efficiency, but it can also lead to user dissatisfaction. A Relationship Manager is far more likely to see a streamlined technology workflow as an inconvenience, something new to learn that pulls them away from their core responsibility of managing customer relationships, than as the game-changing improvement that management and IT leadership believe better positions the company for future growth. This cognitive disconnect creates friction between business and IT management, ultimately decreasing user engagement.

The Dangers of Overpromising and Under-Delivering

It's easy to be excited by a compelling software demo, but it won't necessarily represent what your finished product will look like. For example, a key feature shown in the demonstration may rely on a data integration that is not within the project's scope. Likewise, the user interface may assume a modern, feature-rich back end instead of the current legacy system. False expectations will cause users to perceive even the most successful rollout as a failure. Falling into this trap depletes your credibility among users, adding further difficulty to cultivating adoption.

Solutions

Iron out key details early using process maps to ensure your project starts with realistic baseline goals. Project goals should overlap not only with the company's strategic goals but also with those of your users.

Temper the urge to rebuild existing processes during the planning stage. Instead, explore the business process with a fresh set of eyes and look for key areas of improvement. Including end-user representatives in the discovery process is key to providing insight into pain points. In addition, process improvements can simplify the build process while simultaneously including features that directly improve the quality of the user's workflow, aiding in adoption from the beginning.

Create a tight feedback loop with users immediately after rollout and continue listening. Prepare to address feedback quickly and openly. Few things exhaust users' goodwill faster than repeatedly failing to address feedback. Acquiring mass user adoption is always an uphill battle, but remaining cognizant of these common missteps in the early stages of your project will help provide a healthy foundation that the development team can build upon through user partnership.

Ivory

On January 12, 2023, Twitter revoked third-party access to their API without warning. As a developer, this is one of the most repulsive actions I've seen a social media company take in recent memory, and that's saying something. Deprecating an API is a process typically measured in months or years, giving developers time to create alternative solutions and shift their business model. Instead, these independent development shops found their apps (and primary sources of revenue) destroyed overnight. The situation is particularly egregious considering how integral these third-party apps have been to the platform. Features we now take for granted, such as pull-to-refresh, were created by these developers, and their work steadily improved the user experience (UX) of the platform. For many users, these apps were the face of Twitter.

Luckily, Tapbots, the small two-person developer team responsible for the iconic Twitter app named Tweetbot, had been working on a Mastodon app called Ivory. The unexpected destruction of Tweetbot accelerated the development of Ivory, and Tapbots decided to release an early access version of the app.

The Rise of an Elephant from the Ashes

The pricing model of Ivory has proven to be divisive. Many prospective customers feel it is inappropriate to charge for what is essentially an incomplete app. Others find the $1.99 per month/$14.99 per year subscription fee excessive. While I typically feel annoyed at the prospect of paying for an unfinished product, I'm more than willing to give Tapbots a pass, given the circumstances. Software subscriptions are a reality of the current market. Customers are unwilling to pay large sums of money for software, and most app developers incur an ongoing cost per customer due to cloud sync features and data storage. It is economically unsustainable for a company to provide a lifetime of updates and support for no additional cost.

If I had to distill the interface of Ivory down to one word, it would be sophisticated. The minimalist interface possesses wonderful clarity, and the iconography lends a unique flair to the app. It follows the Apple Human Interface Guidelines while still feeling distinctive. With Ivory, simple doesn't mean primitive, and less doesn't feel sparse. The app displays remarkable attention to detail. The trumpet icon is movable, and flicking it toward a corner causes it to bounce into place. Tapping a button activates a subtle animation that makes the app feel alive. A separate dark theme, explicitly designed to take advantage of the pure blacks offered by OLED screens, provides eye-popping contrast. Ivory performs as well as it looks. It has some of the smoothest scrolling animations I've seen in a third-party app and remains responsive when quickly navigating through the interface.

 
Ivory's beautiful iconography

As good as Ivory is, it could be better. There are several features I found myself missing while using the app. The ability to edit my posts and quote those of others would greatly augment the app's utility. Ivory's ability to sync your place in the timeline over iCloud is welcome, but syncing other customizations, such as filters, would be useful as well. Luckily, most of the features I want are on the official roadmap. Given that the developers at Tapbots have a long history of delivering flawlessly executed features, I have no doubt Ivory has a very bright future ahead.

A Pilgrimage in Technology

There is a fleeting moment just before the sun sets and the light fades, where the gentle glow grants unusual clarity to familiar subjects, where shadows are long, offering contrast and refuge from continuous shape and form. I'm a profoundly introspective person in normal times, and at the conclusion of another year, I often find myself completely lost in contemplation. Perhaps it is because I am approaching 40 and having an early-onset midlife crisis, but my thoughts this year have revolved around how I ended up in a technology career. Why have I been so fascinated with technology from a very young age? Was it inevitable all along, despite the circuitous route I took to get there? Is my effort actually creating value?

A Spark of Curiosity

I was a profoundly curious child. In hindsight, I'm surprised my parents were able to cling to their sanity despite my relentless onslaught of questions. No mechanical or electronic device was safe from my insatiable curiosity as I carefully disassembled them (and often failed to reassemble them correctly) in an attempt to determine how they worked. Naturally, I was drawn to the IBM Personal System/2 my parents brought home with the force of a planet crossing a black hole's event horizon. I knew this device was unique after watching my father turn it on. Unlike modern electronic devices, the ritual was a fascinating delight for the senses—the buzz of the CRT as it warmed, the satisfying click of the oversized power switch, the whirr of the spinning hard drive, and the cacophony of mechanical noise emanating from the floppy drive captured my curiosity like nothing else did. Over time, I learned how to traverse the filesystem with the OS/2 command line and run programs. That PC helped me expand my vocabulary and showed me the joys of writing, but most importantly, it showed me that the gap between imagination and reality was so much smaller than I realized.

A Perfect World

The world is messy, and there are ambiguities around every corner—an infinite palette of gray shrouds every decision, which is a challenging realization for a perfectionist. Which permutation of decisions ultimately leads to the optimal result, and how is success measured? Viewing your choices through this excessively critical lens makes every undertaking feel like an accumulation of defeats, even if you could rationally judge the result as successful. To revere a perfect ideal is to indulge in fantasy, which inevitably leads to disappointment.

Technology sidesteps this problem by creating a universe of binary rules and infinitely precise methods of judging outcomes. It is a perfect world, a paradise of blind equality, where success is flawlessly equitable, and chance plays no role in the outcome. I will receive the same result from the same input whether I'm wealthy or poor, male or female, atheist or devout. It's a comforting place for a weary traveler seeking refuge from the unpredictability and perceived inequity of life.

A Gateway to Possibilities

I've recently been reminded of the transformative power of technology on a very personal level. It began years ago with a faint, muffled ring in my ears. It wouldn't go away but was so soft that it was easily consumed by even minor, omnipresent background noise. I didn't think much of it until recently when the ringing began overtaking normal conversation. As a lifelong lover of classical music, I could no longer appreciate the complex, undulating contrapuntal movement from which the immense power of each melody is drawn. Only in my memory did the music possess the beauty I knew it had.

Technology considered advanced only a few years ago has been miniaturized to the point that it can fit in a hearing aid small enough to tuck behind the ear. Microprocessors have become so powerful that the signal processing-induced latency is less than 0.5 milliseconds, facilitating the seamless mixing of enhanced sound from the hearing aid and the audible remnants of the natural sound. Fractal tones help obscure the constant, sanity-warping tinnitus associated with hearing loss, and machine learning algorithms constantly adjust output to detect and separate speech from ambient noise.

I've built programs ranging in complexity from a basic Java-based marine biology simulation in high school to implementing an end-to-end commercial loan origination process complete with automated, ML-based decisioning. Yet, three decades later, I still feel the same wonder of that small boy peering into a dim CRT monitor, realizing that technology is a gateway to actualizing the possibilities of the human mind.

2023: An End to Creating Value with Imagination

We were promised jetpacks, flying cars, and human settlements on Mars. Instead, we were given TikTok, the metaverse, and an endless supply of cryptocurrency scams. The future of technology is notoriously difficult to predict and often moves in incomprehensible ways. Despite the loud objections of our analytical minds, the allure of attempting to look into the future is too strong. We inevitably succumb to our urge and confidently make (mostly inaccurate) predictions. Now that we’ve taken yet another trip around the sun, I’d like to indulge in this arbitrary and useless ritual. Below are three trends that I predict will take center stage in 2023.

A Deflating Bubble

This year will bring an end to profitless tech firms with fantastical valuations due to pandemic-fueled growth. The stock values of money-losing tech companies dropped sharply in the fourth quarter of 2022, down 57% and significantly underperforming market indices.1 Silicon Valley startups are coming to the realization that a good idea isn’t enough to create value. This trend will continue throughout 2023 as investors flock to more conservative companies with a track record of profitability. As a result, we will witness the beginning of a mass extinction event of technology startups.

A Year of Doing More with Less

Unclear geopolitical and economic indicators will cause leaders in the technology space to become more cautious. 2023 will be a sober year of targeted, strategic technological investment. Successful companies will find areas where innovations overlap and technologies work harmoniously to create new possibilities that provide multiplicative value. While unexciting, investing in foundational IT architecture will enable companies to take advantage of new and disruptive technologies quickly.

A Continued Exodus to the Cloud

Over the past few years, “the cloud” has gone from a trendy buzzword to an inescapable necessity for businesses. Even traditionally conservative and security-centric industries, such as banking, have completely shifted from a vertically integrated data supply chain powered by legacy mainframe architecture to an absolute embrace of cloud technologies. As a developer of software running on cloud services, I’ve had a front-row seat to this trend. While a cloud-based data strategy isn’t without some drawbacks, it is remarkably effective in controlling costs, scaling rapidly, and offering unparalleled resiliency.

References

1 Vlastelica, R. (2022, October 10). Tech earnings matter more than ever as the bubble deflates. Bloomberg.com. Retrieved January 13, 2023, from https://www.bloomberg.com/news/articles/2022-10-10/earnings-matter-more-than-ever-as-bubble-deflates-tech-watch

Scaling Silicon

The Apple M1 Ultra - Two M1 Max dies connected via an interposer

Apple raised eyebrows in 2020 when the company announced plans to transition from Intel processors to chips designed in-house, marking the end of a 15-year partnership with Intel.1 For long-time followers of technology, it was reminiscent of Steve Jobs' announcement at the 2005 Worldwide Developers Conference (WWDC), where he revealed Apple's plan to transition from PowerPC to Intel's x86 architecture. Like the x86 transition fifteen years earlier, the rollout of Apple silicon went astonishingly smoothly despite the fundamental incompatibility between the x86 and ARM instruction sets.

For the first time in recent memory, Intel, Advanced Micro Devices (AMD), and Apple have taken divergent strategies in microarchitecture design. Each strategy has its own strengths and weaknesses, so it will be fascinating to see how well each approach scales to future cost, efficiency, and performance demands. AMD's chiplet design offers pricing advantages over Intel at the expense of bandwidth constraints and increased latency. Apple's system-on-a-chip (SoC) strategy requires larger dies but offers complete integration; however, we may be seeing the first cracks in Apple's ARM SoC strategy after the company scaled back plans for a high-end Mac Pro.2 According to Mark Gurman's reporting, a Mac Pro with an SoC larger than the M1 Ultra would likely have a starting cost of $10,000. To get a better perspective on the pricing challenges Apple may face when designing an SoC for the Mac Pro, let's explore how yield and cost change as die size increases.

To start, we can calculate the number of rectangular dies per circular wafer for Apple's base M1 SoC and the larger M1 Max using basic geometry:

                M1                  M1 Max
Die Dimensions  10.9 mm x 10.9 mm   22 mm x 20 mm
Die Size        118.81 mm²          440 mm²
Scribe Width    200 µm              200 µm
Wafer Diameter  300 mm              300 mm
Edge Loss       5.00 mm             5.00 mm
Dies Per Wafer  478                 117

The smaller M1 dies give us four times the quantity per wafer compared with the M1 Max. This is one factor influencing the cost of physical materials, but things get really interesting once we begin calculating yields. The process of fabricating working silicon wafers is delicate and rife with opportunities to produce imperfections. Defects can be caused by contamination, design margin, process variation, photolithography errors, and various other factors. Yield is a quantitative measure of the quality of the semiconductor process and is one of the most important factors in wafer cost. Defect density is measured as the number of defects per square centimeter. Assuming a standard defect density of 0.1/cm² and using a variable defect size yield model for TSMC's N5 node, our two wafers possess vastly different yields:3

                     M1                  M1 Max
Die Dimensions       10.9 mm x 10.9 mm   22 mm x 20 mm
Die Size             118.81 mm²          440 mm²
Scribe Width         200 µm              200 µm
Wafer Diameter       300 mm              300 mm
Edge Loss            5.00 mm             5.00 mm
Dies Per Wafer       478                 117
Defect Density³      0.1/cm²             0.1/cm²
Yield                88.9%               65.4%
Good Dies Per Wafer  425                 76

With the good-die advantage growing from 4x to just over 5.5x, this yield disparity further inflates the larger die's already higher manufacturing cost. Just how much of a price difference? Extrapolating data from the Center for Security and Emerging Technology, we can estimate that a 300 mm wafer produced on TSMC's N5 node costs just under $17,000.4 Therefore, our M1 has a theoretical materials cost of about $40 per good die, while the M1 Max costs about $223. Given that an M1 Ultra is two M1 Max dies connected via an interposer, the raw silicon cost of the Ultra is likely around $450. While all these figures are nothing more than conjecture, they clearly illustrate how quickly costs skyrocket and yields shrink as die size increases.
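For the curious, the arithmetic above can be sketched in a few lines of Python. The die-per-wafer estimate and the simple Poisson defect model below are common textbook approximations rather than the exact calculator used for the tables, so the results land close to, but not precisely on, the figures quoted above.

```python
import math

WAFER_DIAMETER_MM = 300.0
EDGE_LOSS_MM = 5.0
SCRIBE_MM = 0.2          # 200 µm scribe line added to each die edge
DEFECT_DENSITY = 0.1     # defects per cm^2
WAFER_COST_USD = 17_000  # rough N5 wafer cost cited above

def dies_per_wafer(width_mm: float, height_mm: float) -> int:
    """Common estimate: usable wafer area divided by die area, minus a
    correction term for partial dies lost around the circumference."""
    usable_d = WAFER_DIAMETER_MM - 2 * EDGE_LOSS_MM
    die_area = (width_mm + SCRIBE_MM) * (height_mm + SCRIBE_MM)
    return round(math.pi * (usable_d / 2) ** 2 / die_area
                 - math.pi * usable_d / math.sqrt(2 * die_area))

def yield_rate(width_mm: float, height_mm: float) -> float:
    """Simple Poisson defect model: exp(-defect_density * die_area)."""
    die_area_cm2 = (width_mm * height_mm) / 100.0
    return math.exp(-DEFECT_DENSITY * die_area_cm2)

for name, w, h in [("M1", 10.9, 10.9), ("M1 Max", 22.0, 20.0)]:
    total = dies_per_wafer(w, h)
    good = int(total * yield_rate(w, h))
    print(f"{name}: {total} dies/wafer, ~{yield_rate(w, h):.1%} yield, "
          f"~{good} good dies, ~${WAFER_COST_USD / good:.0f} per good die")
```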

Where does this leave the Mac Pro and the Apple silicon roadmap? Cost-effective silicon with performance significantly higher than the current top-tier SoC will likely require a more advanced lithography process to decrease transistor size. A likely candidate is TSMC's N3 node, which is where Apple is headed over the next few product cycles. However, the rate at which manufacturers can decrease transistor size is rapidly slowing, as evidenced in the chart below, so a more fundamental rethinking of chip manufacturing is on the horizon.

Chart source: TSMC. (2021, June). 3nm Technology. https://www.tsmc.com/english/dedicatedFoundry/technology/logic/l_3nm

One certainty is that we are entering an exciting period of technological advancement that is beginning to disrupt the market. The ability of technology companies to adapt quickly is shifting from a mere competitive advantage to a requirement for survival. The future belongs to those who dare to think without boundaries.

References:

1 Gurman, M., & King, I. (2020, June 22). Apple-made computer chips coming to Mac, in split from Intel. Bloomberg.com. Retrieved December 26, 2022, from https://www.bloomberg.com/news/articles/2020-06-22/apple-made-computer-chips-are-coming-to-macs-in-split-from-intel?sref=9hGJlFio

2 Gurman, M. (2022, December 18). Apple scales back high-end Mac Pro plans, weighs production move to Asia. Bloomberg.com. Retrieved December 30, 2022, from https://www.bloomberg.com/news/newsletters/2022-12-18/when-will-apple-aapl-release-the-apple-silicon-mac-pro-with-m2-ultra-chip-lbthco9u

3 Cutress, I. (2020, August 25). ‘Better yield on 5nm than 7nm’: TSMC update on defect rates for N5. AnandTech. https://www.anandtech.com/show/16028/better-yield-on-5nm-than-7nm-tsmc-update-on-defect-rates-for-n5

4 Kahn, S., & Mann, A. (2022, June 13). AI chips: what they are and why They Matter. Center for Security and Emerging Technology. Retrieved December 30, 2022, from https://cset.georgetown.edu/publication/ai-chips-what-they-are-and-why-they-matter

Further Reading:

Agrawal, V. D. (1994). A tale of two designs: the cheapest and the most economic. Journal of Electronic Testing, 5(2–3), 131–135. https://doi.org/10.1007/bf00972074