Setting up Babelfish for Aurora PostgreSQL for performance testing using HammerDB
In this blog post, I will provide details on how to set up Babelfish for Aurora PostgreSQL for performance testing using the HammerDB benchmarking tool.
1. Introduction
Whether it’s a component of migrating to Amazon Web Services or optimizing workloads already on Amazon Web Services, customers are looking at the options to modernize their SQL Server workloads. An attractive modernization option is to migrate these workloads to open-source platforms, like Babelfish, to avoid expensive Microsoft licenses, vendor lock-in periods, and audits.
Babelfish for Aurora PostgreSQL is a capability of Amazon Aurora PostgreSQL-Compatible Edition that enables Aurora to understand commands from applications written for Microsoft SQL Server. By allowing Aurora PostgreSQL to understand T-SQL and communicate over the SQL Server wire protocol (TDS), Babelfish lets applications originally written for SQL Server work with Aurora with little or no change to the application code.
As Babelfish matures, providing more features and capabilities, more and more SQL Server workloads can be migrated to Babelfish. So it is unsurprising that many Amazon Web Services customers are evaluating Babelfish as a migration target for their SQL Server workloads.
Besides functional compatibility, customers also want to confirm that Babelfish delivers the performance their workloads require, which makes benchmarking with an industry-standard tool such as HammerDB a natural part of the evaluation.
This blog post highlights the issues of using HammerDB with Babelfish and shows how to overcome them to successfully implement Babelfish performance testing using HammerDB.
2. Prerequisites
You will need an Amazon EC2 instance to act as the test driver (client) machine for HammerDB.
First, you need to install the HammerDB tool on the test driver instance; installation packages and instructions are available on the HammerDB website.
To run performance testing, you will need a Babelfish for Aurora PostgreSQL DB cluster, which you can create by following the steps in the Amazon Aurora documentation.
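Once the cluster is available, it helps to confirm TDS connectivity from the test driver before configuring HammerDB. The following check is a minimal sketch, assuming you connect with sqlcmd or SSMS to your own cluster endpoint on the default TDS port 1433 with the credentials you chose:
-- Minimal connectivity check over the Babelfish TDS (SQL Server wire protocol) endpoint.
-- Run from sqlcmd or SSMS against your own cluster endpoint and credentials.
SELECT @@VERSION AS engine_version;  -- Babelfish reports its version in SQL Server style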
3. Building HammerDB OLTP database on Babelfish
To use HammerDB for performance benchmarking, it is necessary to first build a test database. HammerDB supports the TPROC-C and TPROC-H workloads, which are derived from the industry-standard TPC-C and TPC-H benchmarks; this post uses the TPROC-C OLTP workload.
HammerDB supports building the OLTP test database on multiple platforms. As Babelfish is compatible with SQL Server, when generating the test database using HammerDB, I selected SQL Server as the database platform and configured the Schema Build options as shown in Figure 1.
Figure 1. Setting HammerDB Schema Build options.
For the SQL Server field, you will use the endpoint of your Babelfish for Aurora PostgreSQL DB cluster, which accepts SQL Server connections on the TDS port (1433 by default).
For the SQL Server User ID and SQL Server User Password fields, use the user ID and password you specified when creating the Babelfish DB cluster.
For the TPROC-C SQL Server Database field, enter your Babelfish cluster database name. While HammerDB can create and populate a test schema in an existing empty SQL Server database, with Babelfish the HammerDB database load will fail if you select an existing database. Thus, it is important to let HammerDB create a database instead of generating a schema into an existing database.
The field Number of Warehouses defines the size of the test database, with each warehouse using about 100 MB of database space. The larger the value of Virtual Users to Build Schema, the faster the build will proceed, as virtual users build the warehouses in parallel. Specify a number of virtual users that is equal to or less than the target number of warehouses. Building a 10-warehouse database, as shown in Figure 1, may take 10 to 15 minutes, depending on the size of your test driver instance and the Aurora for PostgreSQL cluster instance hosting the Babelfish database.
If you plan to build a large test database, then specifying a large number of virtual users is highly beneficial, as long as the test driver instance running HammerDB can support that many concurrent users and the Babelfish DB cluster uses a sufficiently large host. For example, to create a test database with 75,000 warehouses using 500 virtual users, as shown in Figure 2, I configured the Aurora for PostgreSQL DB cluster with a correspondingly large instance class.
Figure 2. HammerDB Schema Build options for a large database.
It is important to note that after the rest of the virtual users finish building and populating the database, HammerDB virtual user #1 continues to create indexes and stored procedures, as shown in Figure 3, which may also take significant time for a large test database.
Figure 3. Virtual user #1 continues working after the other users have finished.
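Before moving on, it is worth a quick sanity check that the loaded data matches what you requested. The queries below are a sketch that assumes the standard HammerDB TPROC-C table names and a test database named tpcc; substitute the database name you entered in the Schema Build options.
-- Sanity check of the freshly built TPROC-C schema, run over the TDS endpoint.
-- 'tpcc' is a placeholder for the database name from your Schema Build options.
USE tpcc;
SELECT COUNT(*) AS warehouses FROM dbo.warehouse;  -- should equal the Number of Warehouses you built
SELECT COUNT(*) AS items FROM dbo.item;            -- TPROC-C loads a fixed set of 100,000 items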
4. Post-build configuration of HammerDB OLTP database
At the end of the build process, while all worker users terminate successfully, virtual user #1 will terminate with an error, as shown in Figure 4.
Figure 4. Virtual user #1 terminates with an error.
After analyzing the build log, I found that this error was related to creating the dbo.sp_updstats procedure. The error occurs because Babelfish currently does not support the with execute as 'dbo' clause shown in Figure 5:
CREATE PROCEDURE dbo.sp_updstats
with execute as 'dbo'
as exec sp_updatestats
Figure 5. Failed CREATE PROCEDURE script.
Remediation is simple: connect to your new Babelfish database using SSMS (or another tool compatible with SQL Server) and create the procedure dbo.sp_updstats using the script shown in Figure 6:
CREATE PROCEDURE dbo.sp_updstats
as exec sp_updatestats
Figure 6. Changed script for sp_updstats HammerDB stored procedure.
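To confirm that the procedure now exists, you can run a quick check from the same SQL Server-compatible session; the query below is a simple sketch using OBJECT_ID:
-- Verify that the replacement stored procedure was created in the test database.
SELECT CASE WHEN OBJECT_ID('dbo.sp_updstats') IS NOT NULL
            THEN 'sp_updstats created'
            ELSE 'sp_updstats missing'
       END AS check_result;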
HammerDB calculates a system-independent performance metric of NOPM (new orders per minute), along with a database-specific TPM (transactions per minute) metric.
HammerDB calculates TPM based upon the system view sys.dm_os_performance_counters, which is not available in Babelfish by default. To allow HammerDB to report TPM, create a limited version of this view on the PostgreSQL side of your Babelfish cluster using the script in Figure 7 (replace babelfish_db with the name of your Babelfish database if it differs):
-- DDL for sys.dm_os_performance_counters view
-----------------------------------------------------
CREATE OR REPLACE VIEW sys.dm_os_performance_counters
AS
SELECT 'SQLServer:SQL Statistics'::text AS object_name,
'Batch Requests/sec'::text AS counter_name,
272696576 AS cntr_type,
sum (pg_stat_database.xact_commit
+ pg_stat_database.xact_rollback)::integer
AS cntr_value
FROM pg_stat_database
WHERE pg_stat_database.datname = 'babelfish_db'::name;
--------------------------------------------------------------
-- To Enable access from TDS-SQL side execute following commands:
--------------------------------------------------------------
ALTER VIEW sys.dm_os_performance_counters OWNER TO master_dbo;
GRANT SELECT ON sys.dm_os_performance_counters TO PUBLIC;
Figure 7. Script for the sys.dm_os_performance_counters limited view.
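After creating the view and granting access, you can verify that it is reachable over the TDS endpoint, which is how HammerDB will query it. A quick check from SSMS or sqlcmd might look like this:
-- Confirm the limited view is visible from the SQL Server (TDS) side.
-- HammerDB samples this counter periodically and derives TPM from the difference between samples.
SELECT counter_name, cntr_value
FROM sys.dm_os_performance_counters
WHERE counter_name = 'Batch Requests/sec';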
With these minor changes, the Babelfish database is ready for HammerDB performance benchmarking.
5. Running HammerDB performance test against Babelfish
To run the HammerDB performance test, you must first configure the Driver Script options, as shown in Figure 8.
Figure 8. Setting Driver Script options.
Selecting the Timed Driver Script option is appropriate for running a series of repeatable tests. The other option is to run manual benchmarks to validate installation and configuration. In the Minutes of Ramp-Up Time field, you specify the number of minutes for ramp-up before starting the test. This should include the time required to create all virtual users for the test, as well as warming up the database for the test. It is better to err on the high side because if you do not provide enough time to create all the virtual users and warm up the database (preload buffers, optimize query plans, etc.), you might get incorrect benchmarking results.
For the Minutes of Test Duration field, it is safe to use 5, but you may experiment with higher numbers as well. To ensure greater I/O activity, select the Use All Warehouses option.
To identify the top performance of the system, you can use the HammerDB Autopilot feature, which runs a sequence of tests with an increasing number of virtual users.
Figure 9. HammerDB Autopilot options.
Figure 9 shows a sample Autopilot configuration. The Minutes per Test in Virtual User Sequence field specifies the time allocated to run each test. The HammerDB client starts counting this time after it creates all virtual users for the test run, which is part of the ramp-up time. Thus, you may select a number smaller than the sum of ramp-up and test time; this may reduce the overall time required to run the long test sequence. Monitor the first test in the sequence; if it does not provide results before starting the next test, increase this number.
6. Identifying the size of your HammerDB test database
Babelfish performance, similar to most other databases, heavily depends on the size of the test database. With HammerDB, the size of the database depends on the number of warehouses defined when building the test database. The number of warehouses multiplied by roughly 100 MB gives a good estimate: the 10-warehouse database from Figure 1 is about 1 GB, while the 75,000-warehouse database from Figure 2 is roughly 7.5 TB.
Suppose you need to get the exact size of your sample Babelfish database created by HammerDB. In that case, you may use the script presented in Figure 10, which I developed specifically to address this question. Because the script queries PostgreSQL catalog tables directly, run it while connected to your Babelfish cluster on the PostgreSQL port using a PostgreSQL client such as psql or pgAdmin.
SELECT
schema_name,
pg_size_pretty(sum(table_size)) AS GB_size,
sum(table_size) as Byte_Size
FROM (
SELECT
pg_catalog.pg_namespace.nspname AS schema_name,
relname,
pg_relation_size(pg_catalog.pg_class.oid) AS table_size
FROM pg_catalog.pg_class
JOIN pg_catalog.pg_namespace
ON relnamespace = pg_catalog.pg_namespace.oid
) t
WHERE schema_name NOT LIKE 'pg_%' AND schema_name NOT LIKE 'sys%'
AND schema_name NOT LIKE 'information_%' AND schema_name NOT LIKE 'master_%'
AND schema_name NOT LIKE 'temp_%' AND schema_name NOT LIKE 'msdb_%'
GROUP BY
schema_name;
Figure 10. Script to calculate Babelfish test database size.
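If you only need the total size rather than the per-schema breakdown, a simpler check from the same PostgreSQL session is pg_database_size; the database name babelfish_db below matches the one used in Figure 7, so substitute your own if it differs.
-- Total size of the Babelfish database, including Babelfish system objects,
-- so expect a slightly larger number than the per-schema sum from Figure 10.
SELECT pg_size_pretty(pg_database_size('babelfish_db')) AS total_db_size;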
7. Cleanup
If you created the prerequisites defined in Section 2, dispose of them at the end of performance testing to avoid extra costs. To delete the Babelfish test database, connect over the TDS endpoint and drop it; if the DB cluster and the EC2 test driver instance were created only for this exercise, delete them as well.
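For example, dropping the test database over the TDS endpoint takes a single statement; the name tpcc is again a placeholder for the database name you entered in the Schema Build options.
-- Remove the HammerDB test database when performance testing is finished.
-- 'tpcc' is a placeholder; use your own database name.
DROP DATABASE tpcc;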
8. Conclusions
In this blog post, I outlined steps and provided details on how to build a sample Babelfish database to run benchmarking using the industry-standard HammerDB performance tool. I identified issues that may arise when building test databases and provided steps and scripts to resolve them. I discussed the basics of database performance testing using HammerDB and outlined important parameters.
With this information, you should be able to start Babelfish performance testing for your scenarios and get confidence in the Babelfish product to facilitate your migration from MS SQL Server to the open-source compatible Babelfish for Aurora PostgreSQL platform.
Using the approach outlined in this post, we performed multiple Babelfish performance tests and published our results in a separate blog post.
Amazon Web Services can help you assess how your company can get the most out of the cloud. Join the millions of Amazon Web Services customers that trust us to migrate and modernize their most important applications in the cloud. To learn more about modernizing Windows Server or SQL Server workloads, visit the Amazon Web Services website.