Azure SQL Managed Instance - Windows Authentication memory leak
Working closely with our customers and with Microsoft is a core focus for our team. Recently, we helped a client migrate their on-premises SQL Servers to a single Azure SQL Managed Instance, only to see the instance fail over several times each week. Our monitoring software detected the restarts and showed that the instance was repeatedly failing over to other nodes within the cluster.
We asked the customer to raise a ticket with Microsoft and shared support data that highlighted a telling pattern: server memory decreasing over time. This prompted an in-depth investigation by Microsoft, which led to the discovery of a bug in the Azure SQL Managed Instance code.
According to Microsoft’s findings, “After thorough investigation PG has identified an issue regarding to some memory objects on Windows Authentication code path not being cleaned up completely, so memory usage by Azure SQL Managed Instance constantly grows in direct correlation with number of Windows Authentication requests. Due to memory pressure which happens over time, SQL can cut off connections or failover. In an effort to prevent similar issues in the future, we are treating this issue with highest priority and fix has already been developed. Alongside we are working on mechanisms to detect and prevent these kind of issues as early as possible.”
We are awaiting deployment of the fix. We believe this experience underscores the importance of robust monitoring and resource management for all SQL environments, including those in the cloud.
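As an illustration of the kind of check that can surface a pattern like this early, here is a minimal Python sketch that fits a straight line to periodic memory readings and flags a sustained decline. It is not our monitoring product's implementation and not anything Microsoft provided. The function names, the (hours, available MB) sample format, and the -50 MB/hour threshold are all assumptions made for the example; how the readings are collected is left out entirely.

```python
from statistics import mean


def memory_trend_mb_per_hour(samples):
    """Return the least-squares slope of memory (MB) against time (hours).

    `samples` is a list of (hours_since_start, available_mb) pairs.
    This input format is an assumption for the sketch, not a real API.
    """
    if len(samples) < 2:
        raise ValueError("need at least two samples")
    xs = [t for t, _ in samples]
    ys = [m for _, m in samples]
    x_bar, y_bar = mean(xs), mean(ys)
    denom = sum((x - x_bar) ** 2 for x in xs)
    if denom == 0:
        raise ValueError("samples must span more than one point in time")
    return sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / denom


def looks_like_leak(samples, threshold_mb_per_hour=-50.0):
    """Flag a sustained decline in available memory.

    The threshold is a hypothetical value; tune it to the workload.
    """
    return memory_trend_mb_per_hour(samples) <= threshold_mb_per_hour


# Hypothetical readings: available memory dropping steadily each hour.
declining = [(0, 8000), (1, 7900), (2, 7800), (3, 7700)]
# Hypothetical readings: memory fluctuating around a stable level.
stable = [(0, 8000), (1, 8010), (2, 7990), (3, 8000)]

print(looks_like_leak(declining))
print(looks_like_leak(stable))
```

A trend over many samples is used rather than a single low reading because a leak of this kind shows up as steady drift, while normal workloads cause short-lived dips.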
For more insights into SQL Agility and our services, visit SQL Agility Monitoring