## ACTS Blog Selection

# Amazon Web Services releases open-source software Palace for cloud-based electromagnetics simulations of quantum computing hardware

Today, we are introducing Palace, short for **PA**rallel, **LA**rge-scale **C**omputational **E**lectromagnetics, a parallel finite element code for full-wave electromagnetics simulations. Palace is used at the

We are making Palace

## Why did we build Palace?

Computational modeling typically requires scientists and engineers to make compromises between model fidelity, wall-clock time, and computing resources. Recent advances in cloud-based HPC have

Palace uses scalable algorithms and implementations from the scientific computing community and supports recent advancements in computational infrastructure to deliver state-of-the-art performance. On Amazon Web Services, this includes the

Lastly, we built Palace because, although many highly performant open-source tools exist for a wide range of applications in computational physics, there are few open-source solutions for massively parallel, finite element-based computational electromagnetics. Palace supports a wide range of simulation types: eigenmode analysis, driven simulations in the frequency and time domains, and electrostatic and magnetostatic simulations for lumped parameter extraction. As an open-source project, it is also fully extensible by developers looking to add new features for problems of industrial relevance. Much of Palace is made possible by the

Palace adds to the ecosystem of open-source software supporting cloud-based numerical simulation and HPC, which enables the development of custom solutions and cloud infrastructure for simulation services and gives you more choice than existing alternatives.

## Examples of electromagnetics simulations for quantum hardware design

In this section, we present two example applications that demonstrate some of the key features of Palace and its performance as a numerical simulation tool. For all of the presented applications, we configured our cloud-based HPC cluster to compile and run Palace using GCC v11.3.0, OpenMPI v4.1.4, and EFA v1.21.0 on Amazon Linux 2. We used

### Transmon qubit and readout resonator

The first example considers a common problem encountered in the design of superconducting quantum devices: the simulation of a single transmon qubit coupled to a readout resonator, with a terminated coplanar waveguide (CPW) transmission line for input/output. The superconducting metal layer is modeled as an infinitely thin, perfectly conducting surface on top of a c-plane sapphire substrate.

An eigenmode analysis is used to compute the linearized transmon and readout resonator mode frequencies, decay rates, and corresponding electric and magnetic field modes. Two finite element models are considered: a fine model with 246.2 million degrees of freedom, and a coarse model with 15.5 million degrees of freedom whose computed frequencies differ by 1% from those of the fine model. For the interested reader, Maxwell’s equations are discretized on a tetrahedral mesh using third-order, H(curl)-conforming Nédélec elements in the fine model and similar first-order elements in the coarse model. Figure 1 shows the 3D geometry of the transmon model and a view of the mesh used for simulation. Figure 2 shows visualizations of the magnetic field energy density for each of the two computed eigenmodes.

For each of the two models, we scale the number of cores used for the simulation to investigate the scalability of Palace on Amazon Web Services across a variety of EC2 instance types. Figure 3 plots the simulation wall-clock times and computed speedup factors for the coarse model, while Figure 4 plots them for the higher-fidelity fine model. We observe simulation times of approximately 1.5 minutes and 12 minutes for the coarse and fine models, respectively, achieved with the scalability of EC2. Notice also that the c7g.16xlarge instance type, featuring the latest-generation Amazon Web Services Graviton3 processor, outperforms the previous-generation c6gn.16xlarge and often matches the performance of the latest Intel-based instance types.
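For readers who want to reproduce this kind of plot from their own runs, one common way to define speedup is the ratio of a baseline wall-clock time to the time at a larger core count, with parallel efficiency as the fraction of ideal speedup achieved. The sketch below is a minimal illustration under that assumption. The function name is hypothetical and the timings are placeholders chosen only to show the arithmetic, not measurements from this post.

```python
def speedup_and_efficiency(cores, wall_times):
    """Return (speedup, parallel efficiency) per run, relative to the first run.

    cores      -- core counts for each run, smallest first
    wall_times -- wall-clock time for each run, in any consistent unit
    """
    if len(cores) != len(wall_times) or not cores:
        raise ValueError("cores and wall_times must be non-empty and equal in length")
    base_cores, base_time = cores[0], wall_times[0]
    results = []
    for n, t in zip(cores, wall_times):
        speedup = base_time / t      # how much faster than the baseline run
        ideal = n / base_cores       # speedup under perfect strong scaling
        results.append((speedup, speedup / ideal))
    return results


# Placeholder timings in minutes (illustrative only, not data from this post).
example_cores = [64, 128, 256]
example_times = [12.0, 6.0, 4.0]
for n, (s, e) in zip(example_cores,
                     speedup_and_efficiency(example_cores, example_times)):
    print(f"{n:4d} cores: speedup {s:.2f}x, efficiency {e:.0%}")
```

An efficiency near 100% means that doubling the core count roughly halves the wall-clock time. Lower values indicate that communication or serial work is starting to dominate.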

### Superconducting metamaterial waveguide

The second example simulates a superconducting metamaterial waveguide based on a chain of lumped-element microwave resonators. This model is constructed to predict the transmission properties of the device presented in Zhang et al., Science 379 (2023)

We consider models of increasing complexity starting from a single unit-cell (see Figure 5 below), with 242.2 million degrees of freedom, and increasing to 21 unit-cells, with 1.4 billion degrees of freedom. The complexity of simulating this device comes from geometric features over a large range of length scales, with trace widths of 2 μm relative to an overall model length of 2 cm. The number of EC2 instances used for the simulation is increased with the number of metamaterial unit-cells, to maintain a constant number of degrees of freedom per processor.
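The arithmetic behind this weak-scaling setup can be sketched as follows. It uses only figures quoted in this post: 1.4 billion degrees of freedom for the largest case, which runs on 200 c6gn.16xlarge instances or 12,800 cores. The helper name is hypothetical, and the instance count computed for the single unit-cell model is an illustrative estimate under a strictly constant degrees-of-freedom-per-core assumption, not a figure reported here.

```python
import math

# 12,800 cores on 200 instances implies 64 cores per c6gn.16xlarge instance.
CORES_PER_INSTANCE = 12_800 // 200


def instances_for(dofs, dofs_per_core):
    """Estimate the instances needed to keep DoF per core near dofs_per_core."""
    cores = math.ceil(dofs / dofs_per_core)
    return math.ceil(cores / CORES_PER_INSTANCE)


# Degrees of freedom per core for the largest (21 unit-cell) case.
target = 1.4e9 / 12_800
print(f"DoF per core: {target:,.0f}")

# Illustrative estimate only: the post does not report this instance count.
print("Instances for 242.2 million DoF:", instances_for(242.2e6, target))
```

Under these assumptions, each core handles on the order of 100,000 degrees of freedom regardless of how many unit-cells are in the model.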

Figure 5 shows the metamaterial waveguide geometry for the 1, 4, and 21 unit-cell simulation cases. Computed filter responses for each of the simulation cases are plotted in Figure 6, where we see the frequency response become more complex as the number of unit-cell repetitions increases. The solution computed using the adaptive fast frequency sweep algorithm is checked against a few uniformly sampled frequencies and both solutions show good agreement over the entire frequency band.

Figure 7 plots the wall-clock time required to run the simulations across an increasing number of cores as the models become more complex. All models are simulated on c6gn.16xlarge instances, with the largest case using 200 instances, or 12,800 cores. Wall-clock simulation time is higher when using the adaptive fast frequency sweep, but the two sweeps do not deliver the same result: the uniform sweep provides the frequency response at only 17 sampled frequencies, while the adaptive fast sweep resolves it at 4001 points. For the 21 unit-cell model, a uniform frequency sweep would take roughly 27 days to reach the same fine 1 MHz resolution if each frequency point were sampled sequentially.
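To make the comparison concrete, the numbers quoted above can be combined in a short back-of-the-envelope calculation. The per-point solve time and the 17-point sweep time below are not reported in this post. They are inferred from the "roughly 27 days" estimate, assuming every frequency point costs about the same to solve, so treat them as rough illustrations.

```python
points = 4001            # adaptive sweep output resolution (from the text)
resolution_mhz = 1       # frequency spacing in MHz (from the text)
uniform_points = 17      # uniformly sampled frequencies (from the text)
sequential_days = 27     # quoted estimate for 4001 sequential solves

# Frequency band covered by 4001 points spaced 1 MHz apart.
span_ghz = (points - 1) * resolution_mhz / 1000

# Implied cost of one frequency solve, assuming equal cost per point.
minutes_per_point = sequential_days * 24 * 60 / points

# Implied cost of the 17-point uniform sweep under the same assumption.
uniform_hours = uniform_points * minutes_per_point / 60

print(f"Band covered: {span_ghz:.1f} GHz")
print(f"Implied time per frequency point: about {minutes_per_point:.1f} minutes")
print(f"Implied 17-point uniform sweep: about {uniform_hours:.1f} hours")
```

The takeaway is the ratio rather than the absolute values: the adaptive sweep returns more than 200 times as many frequency points as the 17-point uniform sweep.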

As a final comment, simulation wall-clock time increases with model complexity even though the number of degrees of freedom per core is kept roughly constant. We attribute this to the larger number of linear solver iterations required for convergence, as the linear system of equations assembled for each frequency becomes more difficult to solve. Likewise, the adaptive frequency sweep requires more frequency samples, and hence more wall-clock time, to maintain the specified error tolerance as the number of unit-cells in the model increases.

## Conclusion and next steps

This blog post introduced the newly released, open-source finite element code Palace for computational electromagnetics simulations. Palace is licensed under the Apache 2.0 license and is free for anyone in the broader numerical simulation and HPC community to use. Additional information can be found in the

We have also presented the results of two example applications, running Palace on a variety of EC2 instance types and numbers of cores. The simplest way to get started with Palace on Amazon Web Services is to use Amazon Web Services-supported