
Error: Requests under Azure OpenAI API have exceeded token rate limit of S0 pricing tier

Ahmed Aboulnaga

Aug 22, 2024

My Problem

When I call the Azure OpenAI API through my Python code, I get this error:

openai.error.RateLimitError: Requests to the ChatCompletions_Create Operation under Azure OpenAI API version 2023-05-15 have exceeded token rate limit of your current OpenAI S0 pricing tier. Please retry after 86400 seconds. Please go here: https://aka.ms/oai/quotaincrease if you would like to further increase the default rate limit.
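The message carries its own retry hint: 86400 seconds is 24 hours, which is far longer than a typical per-minute throttle. Here is a minimal sketch of pulling that number out of the error text. The function name and regular expression are my own for illustration, not part of the openai package, and the wording of the message may differ between API versions.

```python
import re
from typing import Optional

# Matches the "retry after N seconds" hint in the error text quoted above.
_RETRY_HINT = re.compile(r"retry after (\d+) seconds?", re.IGNORECASE)


def parse_retry_after_seconds(message: str) -> Optional[int]:
    """Return the suggested wait in seconds, or None if the message has no hint."""
    match = _RETRY_HINT.search(message)
    return int(match.group(1)) if match else None


error_text = (
    "Requests to the ChatCompletions_Create Operation under Azure OpenAI API "
    "version 2023-05-15 have exceeded token rate limit of your current OpenAI "
    "S0 pricing tier. Please retry after 86400 seconds."
)
wait_seconds = parse_retry_after_seconds(error_text)
print(wait_seconds, "seconds =", wait_seconds / 3600, "hours")
```

Logging this value next to each failure makes it easy to tell a short throttle from a day-long lockout.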

The first few calls worked fine. As I continued making changes to my code, I started receiving the error above. Was it my code changes? Was it the fact that I changed the API key I was using? Or was it something else?

I spent hours trying to figure out what was going on. This blog post walks you through what I tried and what finally got me through it.

ChatGPT said that "the error you're encountering is related to Azure OpenAI's rate limiting, which can occur even if you have received a quota increase."

See solution at the bottom of this blog post.

Reviewing My Setup

When you log in to the Azure portal at https://portal.azure.com, click on Azure OpenAI to be taken to the Azure AI services page.

Here you can see that I've actually already created my first Azure OpenAI service (I'm not showing the creation process here, but it's pretty straightforward). There are links under Endpoints and Manage keys that take you to the same page.

Clicking on either of those links takes you here. I needed the key and endpoint values to use in my code.

This is how the values were used in my Python code.

import os

import openai  # legacy (pre-1.0) openai package, which uses these module-level settings

# API type
openai.api_type = "azure"

# Azure OpenAI endpoint
openai.api_base = "https://ahmedopenai1.openai.azure.com/"

# See https://stackoverflow.com/questions/76475419/how-can-i-select-the-proper-openai-api-version
openai.api_version = "2023-05-15"
# openai.api_version = "2023-03-15-preview"

# API key from an OS environment variable or just paste actual value
openai.api_key = "fa***************************be4"
# openai.api_key = os.getenv('API-KEY')
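Since I swapped API keys while debugging, reading these values from the environment and failing early when one is missing would have ruled out a configuration mistake sooner. This is a sketch only: the environment variable names and the helper function are assumptions I chose for illustration, so use whatever names you actually export.

```python
import os
from typing import Mapping

# Hypothetical variable names chosen for this sketch.
REQUIRED_VARS = ("AZURE_OPENAI_ENDPOINT", "AZURE_OPENAI_API_KEY")


def load_azure_settings(env: Mapping[str, str] = os.environ) -> dict:
    """Collect the endpoint and key, raising early if either one is missing."""
    missing = [name for name in REQUIRED_VARS if not env.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return {
        "api_type": "azure",
        "api_base": env["AZURE_OPENAI_ENDPOINT"],
        "api_key": env["AZURE_OPENAI_API_KEY"],
        # Falls back to the API version used in this post.
        "api_version": env.get("AZURE_OPENAI_API_VERSION", "2023-05-15"),
    }


# Demo with an explicit mapping so the sketch runs without real credentials.
demo = load_azure_settings(
    {
        "AZURE_OPENAI_ENDPOINT": "https://example.openai.azure.com/",
        "AZURE_OPENAI_API_KEY": "placeholder-key",
    }
)
print(demo["api_base"], demo["api_version"])
```

The returned values can then be assigned to openai.api_type, openai.api_base, openai.api_key, and openai.api_version as shown above.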

Clicking on Pricing tier, you can see that I have the S0 Standard pricing tier. The original error had complained about my pricing tier.

Step 1: Increase the OpenAI Quota

The original error above suggests navigating to https://aka.ms/oai/quotaincrease to request a quota increase, which is what I did.

The screenshot below is somewhat condensed for brevity, but I entered all required information. The justification I provided was simple and general in nature. Note that a corporate email address is required.

What Quota Request Type should I choose though (see question #7)? Global Standard? Global Batch? PAYGO? Provisioned?

I navigated to https://learn.microsoft.com/en-us/azure/ai-services/openai/how-to/deployment-types and found Azure recommending Global-Standard as a starting place for customers.

As for the actual total quota to request, this page helped me calculate that 500 tokens is a maximum of $10 daily on the GPT-4o Global Deployment model.

I received an email confirmation stating that "Your request to increase the quota limit has been approved".

This did not resolve the error.

Step 2: Confirm that the Quota is Increased in AOAI Studio

I navigated to the Azure OpenAI Studio (aka AOAI Studio) at https://oai.azure.com, clicked on Quota, and selected the Azure OpenAI Global-Standard tab. I could now see that my quota had increased from '0 of 8' to '0 of 500'.

This did not resolve the error.

Step 3: Upgrade to a Pay-As-You-Go Plan

There are many links in the Azure portal that entice you to upgrade from the free plan, which comes with $200 worth of credit for the first month. I upgraded to the pay-as-you-go plan without technical support.

A few minutes later, I received an email confirming the upgrade.

This did not resolve the error.

Step 4: Wait for 6 Hours

I gave up. At this point, I went out to lunch, attended a few meetings, then came back. Everything magically worked fine afterwards.

It seems that even after the increased quota shows up in AOAI Studio, you just have to give it some time. I suppose ChatGPT was right.
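If waiting is part of the answer, it helps to make the code wait deliberately instead of failing on the first throttle. Below is a minimal sketch of retrying with exponential backoff. The RateLimited exception is a hypothetical stand-in for the SDK's rate limit error (openai.error.RateLimitError in my case), and the parameter names and defaults are assumptions. It gives up right away when the suggested wait is too long, because a 24-hour hint cannot be outwaited in a loop.

```python
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


class RateLimited(Exception):
    """Hypothetical stand-in for the SDK's rate limit error."""

    def __init__(self, retry_after: Optional[float] = None):
        super().__init__("rate limited")
        self.retry_after = retry_after


def call_with_backoff(
    fn: Callable[[], T],
    max_attempts: int = 5,
    base_delay: float = 1.0,
    max_wait: float = 60.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Retry fn on RateLimited, doubling the delay after each failure."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(max_attempts):
        try:
            return fn()
        except RateLimited as err:
            # Prefer the service's own hint; otherwise back off exponentially.
            if err.retry_after is not None:
                delay = err.retry_after
            else:
                delay = base_delay * 2 ** attempt
            # Re-raise on the last attempt or when the wait is impractically long.
            if attempt == max_attempts - 1 or delay > max_wait:
                raise
            sleep(delay)
    raise AssertionError("loop always returns or raises")
```

In real code, fn would wrap the actual API call and translate the SDK's exception into RateLimited, or the except clause would catch the SDK's exception directly.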

UPDATE 8/24/2024: The error returned even though my token usage remained low (1 of 500). There are likely stability or capacity issues in Azure, and I have no choice but to drop this effort and go back to developing against OpenAI instead of Azure OpenAI.

UPDATE 10/29/2024: These steps finally resolved my issue:

1. Navigate to https://oai.azure.com/
2. Click on 'Deployments'
3. Click on your deployed model (e.g., 'gpt-4o')
4. Click on 'Edit'
5. Increase the Tokens per Minute Rate Limit from 1K to 400K
6. Click on 'Save and close'
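A rough calculation shows why the per-deployment limit mattered so much. This is a sketch under assumptions: the 800 tokens per request is a made-up figure for illustration, and Azure's own token accounting may count differently than a simple division.

```python
def max_requests_per_minute(tpm_limit: int, tokens_per_request: int) -> int:
    """Estimate how many requests of a given size fit under a tokens-per-minute limit."""
    if tokens_per_request <= 0:
        raise ValueError("tokens_per_request must be positive")
    return tpm_limit // tokens_per_request


# Assumed request size (prompt plus completion); measure your own traffic.
TOKENS_PER_REQUEST = 800

for tpm_limit in (1_000, 400_000):
    fits = max_requests_per_minute(tpm_limit, TOKENS_PER_REQUEST)
    print(f"{tpm_limit} TPM allows about {fits} request(s) per minute")
```

Under that assumption, a 1K limit leaves room for a single request per minute, which matches my experience of a few calls working before the errors started.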