M1 Pro MacBooks: Building a Tech Foundation

Philip Leonard, 14 Dec 2021, 11 min read

On June 22, 2020, CEO Tim Cook announced Apple’s two-year transition plan to their own silicon in his WWDC keynote speech. This marked the 3rd time in Apple’s history that the tech giant had transitioned to using a new CPU architecture. For tech giants and scale-ups alike, ambitious strategic tech moves bring with them a buzz of excitement. The Picnic iOS team even organises viewing parties to watch along every year!

Apple’s transition to their arm64-based M1 System-on-Chip (SoC) in all Mac products sets the direction not only for what software runs on, but also for what it’s built on. The JetBrains and Stack Overflow developer surveys show that Windows remains the leader among development platforms. Still, a large number of software developers choose to use Apple hardware¹: according to JetBrains, the macOS share has remained fairly stable over the last 4 years, at 44–49% between 2017 and 2021.

The transition from Intel x86 to Apple M1 arm64 directs how we build software on this platform. So what does that mean for the developer experience at Picnic? Before we can build (CI), test (QA), deploy (CD), and maintain our software in production, the first step in the DevOps cycle is of course to design and build software on our very own machines.

What is arm64, and why is it better?

There’s no doubt that the folks at Apple are great at marketing, but what exactly are they selling? Why is there so much hype around this being the next must-have machine for recreational and professional users alike?

ARM stands for Advanced RISC Machine. Ahh, I do love an embedded acronym. RISC in turn stands for Reduced Instruction Set Computer. Fundamentally, the arm64 CPU architecture uses a smaller CPU instruction set, where principally each instruction performs one function. To put this into perspective, x86 arch (Complex Instruction Set Computing) has around 981 instructions compared to around 50 for arm64. This makes the architecture suitable for lower power embedded systems. See here for a more in-depth explanation of the differences between the two architectures.

This isn’t Apple’s first foray into the ARM world. Ever since the original iPhone shipped with an arm11 chip, Apple has used variants of ARM’s CPU architectures to power their mobile devices, from being the first to adopt a 64-bit armv8-a architecture in a smartphone in 2013, right up to the latest arm64 based A-series chipsets in the newest iPhones and iPads.

But what does that now mean for something like a laptop, where performance requirements are greater? Well, M1 chips aren’t just the architecture alone. Actually, the M1 isn’t even a CPU! With their latest silicon design, Apple has developed an entire SoC, or System-on-Chip, including GPU, Neural Engine, Digital Signal Processor, CPU, and more. In essence, they’ve adopted a heterogeneous computing approach with specialised hardware-accelerated chips. They’re also following the trend towards hybrid CPUs, with Firestorm (performance) and Icestorm (efficiency) cores dedicated to different workload intensities for energy efficiency, and with the Firestorm cores being accompanied by an unusually large amount of L1 and L2 cache. On top of this, there are other nifty design features, like Unified Memory Architecture, that squeeze even more out of a CPU architecture that is otherwise intended for mobile and embedded systems.

At the “core” of it, however, and what’s contributing the most to this latest performance boost, is the smaller length of arm64 instructions coupled with an array of super-fast instruction buffers. For comparison, x86 instructions are variable in length, and anywhere between 1 and 15 bytes long. For arm64 this is fixed at 4 bytes. Besides the length, M1s are able to fill instruction buffers up much faster and more trivially than on x86 processors, which need to cut up longer instructions into micro-ops using dedicated hardware decoders. For arm64, a smaller instruction set may mean more instructions per operation, but when it’s blazing fast to fill instruction buffers up because of their size, it’s a win-win for these workloads! Here is a superb article with a much more detailed explanation.
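To make the fixed versus variable length point a little more tangible, here is a toy Python sketch. It is not a real decoder: the variable-length format is made up for illustration (the first byte of each instruction stores its total length), standing in for x86’s 1 to 15 byte instructions, and both function names are my own.

```python
from __future__ import annotations


def fixed_width_boundaries(code: bytes, width: int = 4) -> list[int]:
    """Start offsets of fixed-width instructions (arm64 style, 4 bytes each).

    Every offset follows from simple arithmetic, so all boundaries are
    known up front without looking at the instruction bytes at all.
    """
    if len(code) % width != 0:
        raise ValueError("code length must be a multiple of the width")
    return list(range(0, len(code), width))


def variable_width_boundaries(code: bytes) -> list[int]:
    """Start offsets for a hypothetical variable-length encoding.

    Assumption for illustration only: the first byte of each instruction
    holds its total length (1 to 15 bytes). The start of the next
    instruction is only known once the current one has been inspected,
    so the walk is inherently sequential.
    """
    offsets = []
    position = 0
    while position < len(code):
        length = code[position]
        if not 1 <= length <= 15:
            raise ValueError(f"invalid length {length} at offset {position}")
        if position + length > len(code):
            raise ValueError(f"truncated instruction at offset {position}")
        offsets.append(position)
        position += length
    return offsets
```

With fixed-width instructions, the boundaries for twelve bytes are simply 0, 4 and 8, whereas the variable-width walk has to read each length byte before it can find the next instruction. Real hardware is far more sophisticated, but this is the gist of why filling instruction buffers is simpler on arm64.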

Building Picnic Software on M1s

But hold on, why did I just ramble through CPU architectures? I’m a software engineer, you say, give me the hardware and let me build software quickly. Well, there’s an analogy under all of this. Very simply, CPUs and SoCs are platforms for running code. In modern-day tech companies, Platform teams help engineers in product teams to build, ship, and deploy their code. Just like an optimised M1 Pro SoC with its specialised hardware accelerators, at Picnic, we’re striving to build a tech foundation comprising specialised platform teams that can be accelerators of the self-service approach for the various DevOps workloads and challenges that product teams face.

Ultimately, for software engineering, we’re focused on a couple of KPIs when considering hardware and OS choices:

  • Build times: how fast can I build and test my changes?
  • Tooling support coverage: do I have access to the best tools for the job?

This very blog is written on a 2020 M1 MacBook. While we await the delivery of the M1 Pros to Picnic’s new joiners with great anticipation, we ran through a compatibility and performance check using:

  • Test: 2020 M1 (8 core: 4 ice & 4 fire) MacBook Pro 13 inch with 16 GB of RAM
  • Control: 2019 Intel Core i7 (12 core) MacBook Pro 16 inch with 64 GB of RAM²

I started off with a compatibility check, to get a sense of how we can boost our developer experience and cycle time with the most fundamental of resources: hardware.

Compatibility

A run-through of Is Apple Silicon Ready? reveals the most glaring M1 software compatibility issues. In principle, most should be covered by Rosetta 2, Apple’s dynamic binary translator for x86 to arm64 instructions, allowing you to run x86 compiled binaries on the latest Apple arm64-based silicon.

Apple has now worked on two versions of their Rosetta dynamic binary translator. The first enabled running PowerPC (another RISC CPU architecture from IBM) binaries on Intel-based Macs back in 2006, during Apple’s second CPU architecture transition.

Development IDEs

IDEs and development tooling such as IntelliJ, Xcode, Visual Studio Code, Android Studio, Slack, and Postman all have native M1 binaries, and the performance boost is very welcome!

Coding: ✅

Java

AdoptOpenJDK & OracleJDK worked great on a sample of our backend projects. There was a small compatibility issue with a LocalStack container, remedied by a simple upgrade to a version released after they began distributing arm64 Docker images. Similarly, for Android, there was one small compatibility issue with our Room dependency on a native SQLite distribution, also remedied by a simple upgrade.

Because the Oracle JDK (Java SE) ships arm64 binaries as of Java 17.0.1, and as Picnic is a Java 17 company, this is a match made in heaven 🤝.

GraalVM

At Picnic, we use GraalVM in one of our projects. The project has been upgraded to adopt Java 17, but there’s still an ongoing effort to make GraalVM ready for arm64 builds. See the project status and open issue.

If you’re in the endless pursuit of speeding up Maven builds then you’ve probably heard of mvnd. Using mvnd with an x86 JDK is reportedly slower than using mvn with an arm64 JDK build. There’s an open issue for arm64 support when it comes to GraalVM.

Older Java versions

If your projects aren’t yet on Java 17, then you can also find arm64 MacOS Azul Zulu builds of OpenJDK that support Java versions all the way back to Java 8. If you want to work with the latest and greatest versions of Java then you should definitely consider applying to Picnic 😉

Building: ✅

Docker

At Picnic, we use Docker (Desktop & Hub) for building, registering, and distributing containers for Kubernetes and for our local development environments. As a catch-all solution, Docker Desktop offers support for M1 chips that employs Rosetta 2 to emulate x86 Docker images. With arm64 images available for key infrastructure like Mongo and Postgres, and with support for building multi-arch images of our own services via our CI on our backlog³, we’re on a good path to ever more performant support when running our code locally!

Helm

At Picnic, we use Helm and Helmfile for configuration and package management for Kubernetes deployments. Apart from one small, and recently fixed, issue with helm-diff binaries, the workflow of deploying and configuring application manifests was smooth sailing.

Running: ✅

Miscellaneous Findings

What? You want two monitors?!

Believe it or not, the 2020 M1 MacBook only supports connecting to one external monitor without the assistance of an external graphics card. Given I’ve built up somewhat of a reputation with my teammates for having an array of 3 external monitors at the office, this came as quite a surprise to me. It turns out this year’s M1 Pro models, which we’re rolling out for Picnic devs, support up to two monitors! The M1 Max goes a step further with up to 4. Incremental improvements are better than none. I guess I will still have to wait…

Hey, hi, hello over there! I am an M1!

For the Java gurus amongst us, a word to the wise: unless explicitly told otherwise, out-of-the-box SDKMAN doesn’t detect that you are running arm64 (M1) hardware and won’t install JDK binaries for the right architecture (because currently only a few vendors distribute them). This can be fixed by setting sdkman_rosetta2_compatible=false in ~/.sdkman/etc/config, after which SDKMAN will show only arm64 JDK builds!
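If you’d rather script that change than edit the file by hand, a minimal Python sketch could look like this. The helper name set_config_value is my own invention, and I’m assuming the config file is a plain list of key=value lines; only the key and the file path come from the setup described above.

```python
from pathlib import Path


def set_config_value(config_text: str, key: str, value: str) -> str:
    """Return config_text with key set to value.

    Assumes a simple key=value format, one setting per line. An existing
    line for the key is rewritten; otherwise a new line is appended.
    """
    lines = config_text.splitlines()
    updated = False
    for index, line in enumerate(lines):
        name, separator, _ = line.partition("=")
        if separator and name.strip() == key:
            lines[index] = f"{key}={value}"
            updated = True
    if not updated:
        lines.append(f"{key}={value}")
    return "\n".join(lines) + "\n"


def update_sdkman_config(config_path: Path) -> None:
    """Apply the Rosetta setting to an existing SDKMAN config file."""
    original = config_path.read_text()
    config_path.write_text(
        set_config_value(original, "sdkman_rosetta2_compatible", "false")
    )
```

Calling update_sdkman_config(Path.home() / ".sdkman" / "etc" / "config") would apply the setting; take a backup of the file first if you have customised it.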

In general, double-check you aren’t running x86 binaries where you can benefit from big performance improvements by running arm64 OpenJDK builds from projects like Zulu (for older Java versions like Java 8, for example) and Oracle (for 17.0.1+). This extends to wherever you download your binaries from, and what mechanism it uses to infer your CPU arch (such as uname -m), so that you pick the right one! Even when installing manually, it’s an easy mistake to make by hitting the wrong download link; a curse often disguised as a blessing of Rosetta 2. A straightforward way to check for running applications is shown here by checking the Kind column in the Activity Monitor. Even I made the mistake for the JRE 🙈👇
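For scripts that pick a download based on CPU architecture, a small Python check along these lines can help catch the Rosetta trap. Treat it as a sketch: the function names are mine, and it relies on the sysctl.proc_translated key that macOS uses to report whether a process is being translated by Rosetta 2. Note that an x86 interpreter running under Rosetta reports x86_64, just like uname -m does in a translated shell.

```python
import platform
import subprocess


def classify_machine(machine: str) -> str:
    """Map a machine string (as from `uname -m`) to a coarse architecture."""
    name = machine.lower()
    if name in {"arm64", "aarch64"}:
        return "arm64"
    if name in {"x86_64", "amd64"}:
        return "x86_64"
    return "unknown"


def running_under_rosetta() -> bool:
    """True if this process is being translated by Rosetta 2.

    Asks sysctl for sysctl.proc_translated, which reads 1 for a translated
    process. Where the key or the sysctl tool is missing, we assume no
    translation is taking place.
    """
    try:
        result = subprocess.run(
            ["sysctl", "-n", "sysctl.proc_translated"],
            capture_output=True,
            text=True,
            check=False,
        )
    except OSError:
        return False
    return result.stdout.strip() == "1"


def describe_runtime() -> str:
    """Summarise the architecture this Python process believes it runs on."""
    arch = classify_machine(platform.machine())
    if running_under_rosetta():
        return f"{arch} (translated by Rosetta 2, the hardware is arm64)"
    return arch
```

If describe_runtime() mentions Rosetta 2, the interpreter itself is an x86 build, and anything it downloads based on its own architecture will most likely be x86 too.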

Hot hot hot

Intel processors have been the scapegoat for countless thermal throttling issues with Macs over the years, though these issues are also in part due to Apple’s endless pursuit of uniform design and compactness, which left ventilation and cooling in second place. ARM processors don’t suffer quite the same fate. I can’t count the number of times my Intel-based Mac ground to a halt when compiling Picnic code on a hot summer’s day. What I can tell you is that you can use the M1 comfortably as an actual laptop once again, even under intense loads, without burning your thighs. But don’t take my word for it, I’ll let the thermal (throttling) log when running the very same build on both machines speak for me:

M1:

❯ pmset -g thermlog
Note: No thermal warning level has been recorded
Note: No performance warning level has been recorded
Note: No CPU power status has been recorded

Intel Mac (with an external laptop fan!):

❯ pmset -g thermlog
Note: No thermal warning level has been recorded
Note: No performance warning level has been recorded
2021-12-08 21:08:10 +0100 CPU Power notify
CPU_Scheduler_Limit = 100
CPU_Available_CPUs = 12
CPU_Speed_Limit = 100
2021-12-08 21:08:47 +0100 CPU Power notify
CPU_Scheduler_Limit = 100
CPU_Available_CPUs = 12
CPU_Speed_Limit = 97
2021-12-08 21:08:52 +0100 CPU Power notify
CPU_Scheduler_Limit = 100
CPU_Available_CPUs = 12
CPU_Speed_Limit = 82
# And on and on and on and on…
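If you want to compare machines without eyeballing the log, a short Python sketch can pull out the CPU_Speed_Limit values. I’m assuming here that the log has the line format shown in the Intel output above and that a speed limit below 100 indicates throttling; the function names are my own.

```python
from __future__ import annotations

import re

# Matches lines such as "CPU_Speed_Limit = 97" from `pmset -g thermlog`.
SPEED_LIMIT_LINE = re.compile(r"^\s*CPU_Speed_Limit\s*=\s*(\d+)\s*$")


def speed_limits(thermlog: str) -> list[int]:
    """Extract every CPU_Speed_Limit value, in order of appearance."""
    limits = []
    for line in thermlog.splitlines():
        match = SPEED_LIMIT_LINE.match(line)
        if match:
            limits.append(int(match.group(1)))
    return limits


def was_throttled(thermlog: str) -> bool:
    """True if any recorded speed limit dropped below 100."""
    return any(limit < 100 for limit in speed_limits(thermlog))


# Excerpt of the Intel log shown above.
SAMPLE_LOG = """\
CPU_Scheduler_Limit = 100
CPU_Available_CPUs = 12
CPU_Speed_Limit = 100
CPU_Scheduler_Limit = 100
CPU_Available_CPUs = 12
CPU_Speed_Limit = 97
CPU_Scheduler_Limit = 100
CPU_Available_CPUs = 12
CPU_Speed_Limit = 82
"""
```

Feeding SAMPLE_LOG to speed_limits gives the three recorded limits, while the M1 log contains no CPU_Speed_Limit lines at all, so was_throttled returns False for it.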

I can comfortably write these very words whilst running full builds in the background with the laptop perched on my lap, thinking I wonder if they will put this XKCD in a museum one day?

Build Times

Notable tech companies like Reddit, Shopify, and Twitter are rolling out M1 hardware (either partially or fully) to their engineering teams for some simple reasons: giving the fastest hardware to your engineers improves the developer experience and developer satisfaction, and makes software delivery faster than ever before.

A great rundown of why this is financially beneficial for a tech company, and discussions about the validity of these measurements, can be found courtesy of Reddit’s Jameson Williams here.

So, with that, let’s run a few tests of our own! Recall that we are using a 2020 M1 (8 core, a song of 4 ice & 4 fire cores) MacBook Pro 13 inch with 16 GB of RAM as the test model, and a 2019 Intel Core i7 (12 core) MacBook Pro 16 inch with 64 GB of RAM as the control model⁴. A full build (with filled dependency caches but empty build caches) of a couple of our largest software projects looked like the following:

Our largest Java project, picnic-platform, built using Oracle Java 17.0.1 JDK arm64 and x86 binaries took:

  • Intel 16:10 min
  • M1 07:41 min (that’s a 52% improvement! 😍)

For the M1, using the arm64 JDK binaries is almost twice as fast as running x86 (Rosetta 2) JDK builds. The arm64 binaries really make the difference here.

For the Picnic Android App, running a full test build using Azul Zulu JDK 11.0.13-zulu arm64 and x86 binaries on the test and control devices respectively, preceded by a ./gradlew clean, resulted in:

  • Intel 10:58 min
  • M1 6:28 min (that’s a 41% improvement! 🤩)
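For anyone who wants to check the percentages, or plug in their own build times, here is the arithmetic as a small Python sketch. The helper names are mine, and improvement is defined as the share of the Intel build time that was saved.

```python
def to_seconds(duration: str) -> int:
    """Convert a "minutes:seconds" string such as "16:10" to seconds."""
    minutes, seconds = duration.split(":")
    return int(minutes) * 60 + int(seconds)


def improvement_percent(before: str, after: str) -> float:
    """Percentage of the original build time saved by the faster build."""
    before_seconds = to_seconds(before)
    after_seconds = to_seconds(after)
    return (before_seconds - after_seconds) / before_seconds * 100


def speed_up(before: str, after: str) -> float:
    """How many times faster the new build is than the old one."""
    return to_seconds(before) / to_seconds(after)


java_saving = improvement_percent("16:10", "07:41")    # picnic-platform build
android_saving = improvement_percent("10:58", "6:28")  # Android app build
```

Rounded to whole numbers, these reproduce the 52% and 41% figures quoted above. Expressed as a speed-up, the Java build is a little over twice as fast on the M1.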

After doing some light napkin maths, I expect we can push towards the sub-4-minute mark for our largest Java build on the new M1 Pro models that have twice as many performance cores and three times faster memory bandwidth! I won’t do napkin maths for the savings; I’ll leave this inference down to the reader with assistance from Jameson Williams’ post linked above, but it’s very appealing. Plus, this is on the conservative side for Picnic, and I can’t wait to see further performance improvements on the new improved M1 models which we’re rolling out now! Finally, this leads to some interesting follow-up questions… If one step of the DevOps cycle is on arm64, why not all? Can we build, test, ship and run our software in production on arm64 processors using services like AWS Graviton G6 instances? Watch this space!

How will Picnic build software in the future?

Companies like Uber set 40% platform capacity aside for their platform vs program split. This means dedicating a large portion of team capacity towards working on development, quality, security, and infrastructure tooling, that helps jet boost all engineering teams within the company, focusing on (re-)usability, consolidation, and scale.

Picnic’s strategy to develop a Tech Foundation began a few years ago by building a strong Java platform team. Since then, Python, Infrastructure, and DevOps tooling teams have emerged to help all tech teams build and deploy their software across a diverse tech stack. More recently, the latest platform teams cover Security, QA, Rule Engine, Analytics and more, in order to tackle specific self-service aspects of delivering software end-to-end following the DevOps philosophy. In the coming years, we’ll place an emphasis on expanding our reusable and scalable tech platform.

Conclusion

Hopefully, this post provides some helpful information and softens some of the surprises involved in transitioning development environments to the latest Apple hardware.

Moreover, if you’d like to make impactful contributions to a strong tech foundation, helping build the greatest software in the online grocery revolution, and using the latest and greatest lightning-fast M1 MacBook Pros, then apply to one of Picnic’s diverse range of great Tech positions.
 
  1. We assume those that use macOS also use Apple hardware. In theory, Hackintosh is these days slightly more attainable on a more diverse range of hardware than in yesteryear. It is, however, still lacking key features on Intel processors compared to Apple silicon, which is likely leading to Hackintosh’s demise anyway. Additionally, macOS VMs aren’t officially supported or, quite frankly, legal. I’m quite confident this is also a small minority of users.
  2. Disclaimer: this is not a truly scientific test, as many aspects of the two devices differ beyond the CPU, but it gives a good enough idea of new vs old.
  3. Of course, this can already be done manually now.
  4. Again, a disclaimer: this was the result of a single run, with no statistical significance testing, on hardware that differs in other variables too. The results are just to provide a rough picture.
 
 
