
Wednesday, November 29, 2023

Python script that will asynchronously receive events from an Azure Event Hub and send them to a Log Analytics Workspace custom table

One of the key items I’ve been working on over the past week as a follow up to my previous post:

How to log the identity of a user using an Azure OpenAI service with API Management logging (Part 1 of 2)
https://terenceluk.blogspot.com/2023/11/how-to-log-identity-of-user-using-azure.html

… is to write a Python script that will read events as they arrive in an Event Hub, then send them to a Log Analytics Workspace’s custom table for logging. The topology is as follows:

image
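As a rough orientation, the flow in the topology can be sketched with nothing but the Python standard library. This is a minimal sketch and not the script from my repository: the asyncio.Queue is a stand-in for the Event Hub, `upload` is a hypothetical placeholder for whatever call sends records to the data collection endpoint, and the names `forward_events` and `batch_size` are assumptions made for illustration.

```python
import asyncio
import json


async def forward_events(queue, upload, batch_size=10):
    """Drain JSON event bodies from `queue` and pass them to `upload` in batches.

    `queue` stands in for the Event Hub and `upload` is a hypothetical
    coroutine standing in for the Log Analytics ingestion call.
    A None item on the queue signals that no more events will arrive.
    """
    batch = []
    while True:
        body = await queue.get()
        if body is None:
            break
        batch.append(json.loads(body))
        if len(batch) >= batch_size:
            await upload(batch)
            batch = []
    if batch:
        await upload(batch)  # flush whatever is left over


async def demo():
    queue = asyncio.Queue()
    uploaded = []

    async def fake_upload(records):
        # Record each batch instead of calling a real endpoint.
        uploaded.append(list(records))

    for i in range(3):
        await queue.put(json.dumps({"RequestId": str(i)}))
    await queue.put(None)
    await forward_events(queue, fake_upload, batch_size=2)
    return uploaded


batches = asyncio.run(demo())
```

Batching is a design choice in this sketch rather than a requirement; sending each event as its own one-element list would follow the same pattern.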

The main reason I decided to go with this method is that:

Tutorial: Ingest events from Azure Event Hubs into Azure Monitor Logs (Public Preview)
https://learn.microsoft.com/en-us/azure/azure-monitor/logs/ingest-logs-event-hub

… requires the Log Analytics workspace to be linked to a dedicated cluster or to have a commitment tier. The lowest price for such a configuration would be cost-prohibitive for me to deploy in a lab environment, so I decided to build this simple ingestion method.

Log Analytics Pricing Tiers:

image

I used various documentation to create the script and the App Registration, and to configure the Data Collection Endpoint and Data Collection Rule for the Log Analytics ingestion. Here are a few for reference:

Send events to or receive events from event hubs by using Python
https://learn.microsoft.com/en-us/azure/event-hubs/event-hubs-python-get-started-send?tabs=passwordless%2Croles-azure-portal

Logs Ingestion API in Azure Monitor
https://learn.microsoft.com/en-us/azure/azure-monitor/logs/logs-ingestion-api-overview

Tutorial: Send data to Azure Monitor Logs with Logs ingestion API (Azure portal)
https://learn.microsoft.com/en-us/azure/azure-monitor/logs/tutorial-logs-ingestion-portal

The script can be found at my GitHub repository here: https://github.com/terenceluk/Azure/blob/main/Event%20Hub/Python/Receive-from-Event-Hub-with-checkpoint-store-async.py

The following are some screenshots of the execution and output:

OpenAI API Call from Postman to API Management:

image

Script Execution and Output:

image

Log Analytics Ingestion Results:

image

I hope this helps anyone who might be looking for a script to process events and ingest them into Log Analytics, as it took me quite a bit of time on and off to troubleshoot the various issues I encountered. With this script out of the way, I am now prepared to finish the Part 2 of 2 post for an end-to-end OpenAI logging solution, which I will be writing shortly.

Sunday, November 26, 2023

"204 No Content" returned in Postman when attempting to write logs to a data collection endpoint with a data collection rule for Log Analytics custom log ingestion

I’ve been working on my Part 2 of 2 post to demonstrate how we can use Event Hubs to capture the identity of incoming API access for the Azure OpenAI service published by an API Management and, while doing so, noticed an odd behavior when attempting to use the Logs Ingestion API as outlined here:

Logs Ingestion API in Azure Monitor
https://learn.microsoft.com/en-us/azure/azure-monitor/logs/logs-ingestion-api-overview

image 

I configured all of the required components and wanted to test with Postman before updating the Python script I had for ingesting Event Hub logs, but noticed that I would constantly get a 204 No Content status returned with no entries added to the Log Analytics table I had set up. To make a long story short, the issue was that the JSON body I was submitting was not enclosed in square brackets [], and further tests showed that the same 204 No Content is returned regardless of whether the body is submitted in the accepted format (with square brackets) or not.

The following is a demonstration of this in Postman.

The variables I have defined in Postman are:

  • Data_Collection_Endpoint_URI
  • DCR_Immutable_ID
  • client_id_Log_Analytics
  • client_secret_Log_Analytics

image

The following are where we can retrieve the values:

The Data_Collection_Endpoint_URI can be retrieved by navigating to the Data collection endpoint you had set up:

image

The DCR_Immutable_ID can be retrieved in the JSON view of the Data collection rule that was set up:

image

The client_id_Log_Analytics is located in the App Registration object:

image

The client_secret_Log_Analytics is the secret set up for the App Registration:

image

You’ll also need your tenant ID for the tenant_id variable.

Set up the authorization tab in Postman with the following configuration:

Type: OAuth 2.0

Add authorization data to: Request Headers

Token: Available Tokens

Header Prefix: Bearer

Token Name: <Name of preference>

Grant type: Client Credentials

Access Token URL: https://login.microsoftonline.com/{{tenant_id}}/oauth2/v2.0/token

Client ID: {{client_id_Log_Analytics}}

Client Secret: {{client_secret_Log_Analytics}}

Scope: https://monitor.azure.com/.default

Client Authentication: Send as Basic Auth header

Leave the rest as default and click on Get New Access Token:

imageimage

The token should be successfully retrieved:

image

Click on Use Token:

image
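For anyone who prefers to see the token retrieval as code rather than Postman settings, here is a minimal sketch that only builds the client-credentials request and does not send it. The function name `build_token_request` and the placeholder IDs are assumptions for illustration. Note that this sketch places the client ID and secret in the form body, whereas the Postman configuration above sends them as a Basic Auth header.

```python
import urllib.parse
import urllib.request


def build_token_request(tenant_id, client_id, client_secret):
    """Build (but do not send) an OAuth 2.0 client-credentials token request."""
    url = f"https://login.microsoftonline.com/{tenant_id}/oauth2/v2.0/token"
    form = urllib.parse.urlencode({
        "grant_type": "client_credentials",
        "client_id": client_id,
        "client_secret": client_secret,
        "scope": "https://monitor.azure.com/.default",
    }).encode("ascii")
    return urllib.request.Request(url, data=form, method="POST")


# Placeholder values only; substitute your own tenant, client ID and secret.
req = build_token_request(
    "00000000-0000-0000-0000-000000000000",
    "11111111-1111-1111-1111-111111111111",
    "not-a-real-secret",
)
# Sending it would look like urllib.request.urlopen(req); the JSON response
# carries the bearer token in its "access_token" field.
```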

Configure a POST request with the following URL:

https://{{Data_Collection_Endpoint_URI}}/dataCollectionRules/{{DCR_Immutable_ID}}/streams/Custom-APIMOpenAILogs_CL?api-version=2021-11-01-preview

The Custom-APIMOpenAILogs_CL value can be retrieved in the JSON View of the Data collection rule:

image

Proceed to configure the following for the Params tab:

api-version: 2021-11-01-preview

image

The Authorization key should be filled out with the token that was retrieved.

Set the Content-Type to application/json.

image
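The same POST request can be sketched in Python as follows. This only constructs the request so the URL and headers can be inspected; `build_ingestion_request` is a hypothetical helper, and the endpoint host name and DCR immutable ID in the example are made-up placeholders rather than real values.

```python
import json
import urllib.request


def build_ingestion_request(endpoint, dcr_immutable_id, stream_name, token,
                            records, api_version="2021-11-01-preview"):
    """Build (but do not send) a POST to the data collection endpoint."""
    # Accept the endpoint with or without a leading scheme.
    host = endpoint
    if host.startswith("https://"):
        host = host[len("https://"):]
    host = host.rstrip("/")
    url = (f"https://{host}/dataCollectionRules/{dcr_immutable_id}"
           f"/streams/{stream_name}?api-version={api_version}")
    headers = {
        "Authorization": f"Bearer {token}",
        "Content-Type": "application/json",
    }
    data = json.dumps(records).encode("utf-8")
    return urllib.request.Request(url, data=data, headers=headers, method="POST")


# Placeholder endpoint, DCR ID and token for illustration only.
ingest_req = build_ingestion_request(
    "https://example-dce.eastus-1.ingest.monitor.azure.com",
    "dcr-00000000000000000000000000000000",
    "Custom-APIMOpenAILogs_CL",
    "example-token",
    [{"RequestId": "91ff7b54-a0eb-4ada-8d27-6081f71e44a3"}],
)
```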

For the body, let’s test with the JSON content WITHOUT the square brackets:

{

"EventTime": "11/24/2023 8:19:57 PM",

"ServiceName": "dev-openai-apim.azure-api.net",

"RequestId": "91ff7b54-a0eb-4ada-8d27-6081f71e44a3",

"RequestIp": "74.114.240.15",

"OperationName": "Creates a completion for the chat message",

"apikey": "6f82e8f56e604e6cae6e0999e6bdc013",

"requestbody": {

"messages": [

            {

"role": "user",

"content": "Testing without brackets."

            }

        ],

"temperature": 0.7,

"top_p": 0.95,

"frequency_penalty": 0,

"presence_penalty": 0,

"max_tokens": 800,

"stop": null

    },

"JWTToken": "bearer eyJ0eXAiOiJKV1QiLCJhbGciOiJSUzI1NiIsIng1dCI6IlQxU3QtZExUdnlXUmd4Ql82NzZ1OGtyWFMtSSIsImtpZCI6IlQxU3QtZExUdnlXUmd4Ql82NzZ1OGtyWFMtSSJ9.eyJhdWQiOiJhcGk6Ly8xMmJjY2MyNi1iNzc4LTRhMmQtYWU3YS00ZjU3MzJlN2E3OWQiLCJpc3MiOiJodHRwczovL3N0cy53aW5kb3dzLm5ldC84NGY0NDcwYi0zZjFlLTQ4ODktOWY5NS1hYjBmNTE0MzAyNGYvIiwiaWF0IjoxNzAwODU2MzQ4LCJuYmYiOjE3MDA4NTYzNDgsImV4cCI6MTcwMDg2MTMyNSwiYWNyIjoiMSIsImFpbyI6IkFUUUF5LzhWQUFBQUN5NDZNdUg4VG0yWTF3VDkvazZWVjFzcU9oUWZaOFU5N0ExcWRyT0FMYThGcVVsTEhRclN2OVlwNU5hUE94QnMiLCJhbXIiOlsicHdkIl0sImFwcGlkIjoiMTJiY2NjMjYtYjc3OC00YTJkLWFlN2EtNGY1NzMyZTdhNzlkIiwiYXBwaWRhY3IiOiIxIiwiZmFtaWx5X25hbWUiOiJUdXpvIiwiZ2l2ZW5fbmFtZSI6Ilpha2lhIiwiaXBhZGRyIjoiNzQuMTE0LjI0MC4xNSIsIm5hbWUiOiJaYWtpYSBUdXpvIiwib2lkIjoiZWUxMTZkNTktZDQ5Yi00NTU3LWIyYWItYzkxMWY0NTFkNWM4Iiwib25wcmVtX3NpZCI6IlMtMS01LTIxLTIwNTcxOTExOTEtMTA1MDU2ODczNi01MjY2NjAyNjMtMTg0MDAiLCJyaCI6IjAuQVZFQUMwZjBoQjRfaVVpZmxhc1BVVU1DVHliTXZCSjR0eTFLcm5wUFZ6TG5wNTFSQUU0LiIsInJvbGVddeeQSU0uQWNjZXNzIl0sInNjcCI6IkFQSS5BY2Nlc3MiLCJzdWIiOiJKR3JLbXB4NjVDOGNqRGxUVXBDZFZKaHFoSmtkelJ6b3lJZURENWRMNUhRIiwidGlkIjoiODRmNDQ3MGItM2YxZS00ODg5LTlmOTUtYWIwZjUxNDMwMjRmIiwidW5pcXVlX25hbWUiOiJaVHV6b0BibWEuYm0iLCJ1cG4iOiJaVHV6b0BibWEuYm0iLCJ1dGkiOiJRRWx2U05CX29rUzFLZnV0NTVFNUFBIiwidmVyIjoiMS4wIn0.a__8D9kLedJi48Q9QuEPWUjhqVWJeTZVXkDIcV-gQ5DYCjU7SjwDQWGc1dsYZ_nD0SH4id-PGiTa3RaZo_y5jrtJs_UoW3L8KmViKF1llqaK5XRw7fbGtdPJsFcDXfcWd-hLlWIorjSZ6MdS4beRx4mPTOfeomFWL6e2ExMBzELe_1MzJaUtbYkfZlhoOQu1TUaIoOM5Qs5PpFO1oO-ihcKu3Vl-aY_rmItB1fzRXIip-LQqUVmOwBjOWrzSVkYWRFGnsO1jZNWp0GJKqzVJJFCqNBgZf4BfjN0vvIXRhsR5dGJqd1AAS8VsczZOSBV2uutixNnjJ3jVIZIOa31wzg",

"AppId": "12bccc26-b778-4a2d-bb7a-4f5732e7a79d",

"Oid": "ee116d59-d49b-4557-b2ab-c911f451d5c8",

"Name": "Terence Luk"

    }

image

Notice the returned 204 status:

204 No Content

The server successfully processed the request, but is not returning any content.

No matter how long you wait, the log is never written to Log Analytics.

image

Now WITH square brackets:

image

Notice the same 204 status is returned:

image

However, using the square brackets shows that the log entry is successfully written:

image

The GitHub and forum posts I found have others indicating that this appears to be the expected behavior, so the entry will be written as long as the square brackets are included.
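When the body is built in code rather than typed into Postman, one way to avoid the missing-brackets problem is to always serialize a list, even for a single record. The helper below is a minimal sketch; the name `to_ingestion_body` is hypothetical and the sample record is trimmed down from the body shown earlier.

```python
import json


def to_ingestion_body(records):
    """Serialize one record (dict) or many (list of dicts) as a JSON array."""
    if isinstance(records, dict):
        records = [records]  # wrap a lone record so the body is enclosed in []
    return json.dumps(list(records))


single = {
    "RequestId": "91ff7b54-a0eb-4ada-8d27-6081f71e44a3",
    "Name": "Terence Luk",
}
body = to_ingestion_body(single)
```

Because the 204 response does not distinguish between the two cases, a guard like this is easier than trying to detect the problem from the status code.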

I will be including the instructions on setting up the App Registration, Data Collection Endpoint, Data Collection Rule, and other components in my part 2 of 2 post for logging the identity of an OpenAI call through the API Management.

Friday, November 17, 2023

How to log the identity of a user using an Azure OpenAI service with API Management logging (Part 1 of 2)

The single question I’ve been asked the most over the past few months by colleagues, clients, and other IT professionals is: how can we identify exactly who is using the Azure OpenAI service so we can generate accurate consumption reports and allow proper chargeback to a department? Those who have worked with the diagnostic settings for Azure OpenAI and API Management will know that logging is available but there are gaps that desperately need to be addressed. A quick search over the internet will show that API Management can log the caller’s IP address, but that isn’t very useful for obvious reasons such as:

  1. If it’s public traffic with a public inbound IP address, how would we be able to tell who the user is?
  2. Even if we can tie a public IP address to an organization because that’s the outbound NAT, the identity of the user is not captured
  3. Even if we authenticate the user so a JWT token is provided to call the API, having the public IP address in the logs alone wouldn’t identify the user
  4. If these were private IP addresses, it would be a nightmare to try and match the inbound IP address with an internal workstation’s IP address that is likely DHCP

I believe the first time I was asked this question was 3 months ago and I’ve always thought that Microsoft will likely address this soon with a checkbox in the diagnostic settings or some other easy to configure offering but fast forward to today (November 2023), I haven’t seen a solution so I thought I’d do a bit of R&D over the weekend.

The closest solution I was able to find is from this DevRadio presentation:

Azure OpenAI scalability using API Management
https://www.youtube.com/watch?v=mdRu3GJm3zE&t=1s

… where the presenter used multiple instances of Azure OpenAI to separate prompts to the OpenAI service belonging to different business units. While this solution allowed costs to be separated between predefined business units, the thought of telling a client that I need multiple instances to serve this purpose didn’t seem like something they would be receptive to. While the DevRadio solution did not meet the requirements I had, it did give me the idea that perhaps I could use the Azure API Management feature that logs events to Event Hubs to accomplish what I wanted.

I have to say that this blog post is probably one of the most exciting ones I’ve written in a while, because I was heads down learning and testing the Azure API Management inbound processing capabilities over 3 days of my vacation and felt extremely fulfilled that I now have an answer to something I could not provide a solution to for months.

If you’re still reading this, you might be wondering why there is the label Part 1 of 2. The reason is that I ran out of time and have gotten back to a busy work schedule, so I could not finish the last portion of this solution. Don’t worry, though, as what I cover in Part 1 will at least capture the information needed to identify the calling user. Here is a summary of what I am able to cover in this blog post:

  1. How to set up API Management to log events to Event Hub
  2. What inbound processing code should be inserted to send the OAuth JWT token to event hub
  3. What inbound processing code can be used to extract any values in the JWT token to event hub
  4. How to view the logged entries in event hub

The following is what I will cover in Part 2 in a future post:

  1. How to ingest events from Azure Event Hubs into Azure Monitor Logs
  2. How to use KQL to join events logged by API Management’s diagnostic settings (containing token usage, prompt information) with Azure Event Hub ingested logs (containing user identification)

The following is a high level architecture diagram and the flow of the traffic:

image

I’m excited to get this post published so let’s get started.

Prerequisites

This solution will require us to place an Azure API Management service in front of the Azure OpenAI service so API calls are:

  1. Logged by the APIM
  2. Authorized with OAuth by the API Management

Please refer to my previous post for how to set this up:

Securing Azure OpenAI with API Management to only allow access for specified Azure AD users
https://terenceluk.blogspot.com/2023/11/securing-azure-openai-with-api.html

What is available today out-of-the-box: API Management Diagnostic Settings Logging Capabilities

Assuming you have configured the API Management service as I demonstrated in my prerequisite section and Diagnostics Logging is turned on:

imageimage

… then a set of information for each API call would be logged in the configured Log Analytics workspace. Let’s first review what is available out-of-the-box for the API Management. The complaint I hear repeatedly is that while the logs captured by the API Management provide all the following great information:

  • TenantId
  • TimeGenerated [UTC]
  • OperationName
  • CorrelationId
  • Region
  • IsRequestSuccess
  • Category
  • TotalTime
  • CallerIpAddress
  • Method
  • Url
  • ClientProtocol
  • ResponseCode
  • BackendMethod
  • BackendUrl
  • BackendResponseCode
  • BackendProtocol
  • RequestSize
  • ResponseSize
  • Cache
  • BackendTime
  • ApiId
  • OperationId
  • ApimSubscriptionId
  • ApiRevision
  • ClientTlsVersion
  • RequestBody
  • ResponseBody
  • BackendRequestBody
imageimageimage

None of these captured fields allow for identifying the caller. To address this gap, we can leverage the log-to-eventhub inbound processing feature of API Management and Event Hubs to send additional information about the inbound API call to an event hub, then process it according to our requirements.

Turning on the logging of events for the API Management to Event Hubs

The first step for this solution is to turn on the feature that has API Management log to an Event Hub. I won’t go into the usual detail I provide for setting up the components due to my limited time but begin by creating an Event Hub Instance and Event Hub as shown in the following screenshots to serve as a destination for the APIM to send its logs:

imageimageimage

Once the Event Hub Instance and Event Hub are created, and the API Management’s System Managed Identity is granted access to the Event Hub, we will use the following instructions to turn on the feature in API Management and use the Event Hub:

Logging with Event Hub
https://azure.github.io/apim-lab/apim-lab/6-analytics-monitoring/analytics-monitoring-6-3-event-hub.html

More detail about how the API Management is configured is described here:

How to log events to Azure Event Hubs in Azure API Management
https://learn.microsoft.com/en-us/azure/api-management/api-management-howto-log-event-hubs?tabs=PowerShell

Configuring API Management’s Inbound Processing rule to log JWT token and its values

The API Management log-to-eventhub can send any type of information to the Event Hub. For this post, I am going to demonstrate how to send the following information:

  • EventTime
  • ServiceName
  • RequestId
  • RequestIp
  • OperationName
  • api-key
  • request-body
  • JWTToken
  • AppId
  • Oid
  • Name

Let’s go through these fields in a bit more detail. The following list of fields:

  • EventTime
  • ServiceName
  • RequestId
  • RequestIp
  • OperationName
  • request-body

… are ones that can be retrieved from the out-of-the-box diagnostic settings logs. I haven’t looked into all the available fields, but I suspect that we can send all the out-of-the-box diagnostic settings fields to the event hub to recreate what we have and potentially allow us to turn off the built-in logging. The advantage of such an approach is that all logs will be stored in a single Log Analytics workspace table. The disadvantage is that if new fields are introduced into the built-in logs, we would need to update our log-to-eventhub code to capture those fields.

The other fields:

  • api-key
  • JWTToken
  • AppId
  • Oid
  • Name

… are ones that we’re looking for. The api-key probably isn’t as important, but I wanted to include it to show that it can be captured. The JWT token that was passed to the API Management is captured, and while it can be copied out and then decoded with https://jwt.io/, it isn’t very useful if we’re trying to use KQL to generate reports. The remaining fields (AppId, Oid, and Name), which are probably what everyone is looking for, are extracted from the claims in the JWT token. These fields are just examples that I included in the demonstration, and it is possible to extract any other claim you like by adding to the inbound processing XML code.
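To make the claim extraction concrete, here is a minimal Python sketch of what reading appid, oid and name out of a JWT amounts to. It is an illustration only and not what API Management runs internally: it decodes the payload without verifying the signature, the helper names are hypothetical, and the sample token is built on the spot with made-up values.

```python
import base64
import json


def decode_jwt_claims(token):
    """Return the payload claims of a JWT without verifying its signature."""
    if token.lower().startswith("bearer "):
        token = token[len("bearer "):]
    payload_b64 = token.split(".")[1]
    payload_b64 += "=" * (-len(payload_b64) % 4)  # restore stripped padding
    return json.loads(base64.urlsafe_b64decode(payload_b64))


def identity_fields(token):
    """Pick out the three claims logged as AppId, Oid and Name."""
    claims = decode_jwt_claims(token)
    return {
        "AppId": claims.get("appid", ""),
        "Oid": claims.get("oid", ""),
        "Name": claims.get("name", ""),
    }


def _b64(obj):
    # Encode a dict the way JWT segments are encoded (base64url, no padding).
    raw = json.dumps(obj).encode("utf-8")
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode("ascii")


# Made-up, unsigned token used purely to exercise the helpers.
sample = "Bearer " + ".".join([
    _b64({"alg": "none"}),
    _b64({"appid": "app-123", "oid": "user-456", "name": "Example User"}),
    "signature",
])
fields = identity_fields(sample)
```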

Navigate to the API Management service, APIs blade, Azure OpenAI Service API, All Operations, then click on the </> policy code editor icon under Inbound processing:

image

The following is the XML code insert that you’ll need so that the fields listed above will be captured and sent to the Event Hub:

<!--

    IMPORTANT:

    - Policy elements can appear only within the <inbound>, <outbound>, <backend> section elements.

    - To apply a policy to the incoming request (before it is forwarded to the backend service), place a corresponding policy element within the <inbound> section element.

    - To apply a policy to the outgoing response (before it is sent back to the caller), place a corresponding policy element within the <outbound> section element.

    - To add a policy, place the cursor at the desired insertion point and select a policy from the sidebar.

    - To remove a policy, delete the corresponding policy statement from the policy document.

    - Position the <base> element within a section element to inherit all policies from the corresponding section element in the enclosing scope.

    - Remove the <base> element to prevent inheriting policies from the corresponding section element in the enclosing scope.

    - Policies are applied in the order of their appearance, from the top down.

    - Comments within policy elements are not supported and may disappear. Place your comments between policy elements or at a higher level scope.

-->

<policies>

<inbound>

<base />

<set-header name="api-key" exists-action="append">

<value>{{dev-openai}}</value>

</set-header>

<validate-jwt header-name="Authorization" failed-validation-httpcode="403" failed-validation-error-message="Forbidden" output-token-variable-name="jwt-token">

<openid-config url="https://login.microsoftonline.com/{{Tenant-ID}}/v2.0/.well-known/openid-configuration" />

<issuers>

<issuer>https://sts.windows.net/{{Tenant-ID}}/</issuer>

</issuers>

<required-claims>

<claim name="roles" match="any">

<value>APIM.Access</value>

</claim>

</required-claims>

</validate-jwt>

<set-variable name="request" value="@(context.Request.Body.As<JObject>(preserveContent: true))" />

<set-variable name="api-key" value="@(context.Request.Headers.GetValueOrDefault("api-key",""))" />

<set-variable name="jwttoken" value="@(context.Request.Headers.GetValueOrDefault("Authorization",""))" />

<log-to-eventhub logger-id="event-hub-logger">@{

        var jwt = context.Request.Headers.GetValueOrDefault("Authorization","").AsJwt();

        var appId = jwt.Claims.GetValueOrDefault("appid", string.Empty);

        var oid = jwt.Claims.GetValueOrDefault("oid", string.Empty);

        var name = jwt.Claims.GetValueOrDefault("name", string.Empty);

         return new JObject(

             new JProperty("EventTime", DateTime.UtcNow.ToString()),

             new JProperty("ServiceName", context.Deployment.ServiceName),

             new JProperty("RequestId", context.RequestId),

             new JProperty("RequestIp", context.Request.IpAddress),

             new JProperty("OperationName", context.Operation.Name),

             new JProperty("api-key", context.Variables["api-key"]),

             new JProperty("request-body", context.Variables["request"]),

             new JProperty("JWTToken", context.Variables["jwttoken"]),

             new JProperty("AppId", appId),

             new JProperty("Oid", oid),

             new JProperty("Name", name)

         ).ToString();

     }</log-to-eventhub>

</inbound>

<backend>

<base />

</backend>

<outbound>

<base />

</outbound>

<on-error>

<base />

</on-error>

</policies>

image

The XML code can be found at my GitHub Repository: https://github.com/terenceluk/Azure/blob/main/API%20Management/XML/Capture-APIM-Traffic-and-JWT-Token-Information.xml

Proceed to click on the Save button and additional set-variable and log-to-eventhub policies should be displayed under Inbound processing:

image

With the API Management’s inbound processing rule updated, initiate API calls to the APIM to generate request traffic and let it capture the information. Once a few requests have been made, navigate to the Event Hub then Process data:

image

Within the Process data blade, click on the Start button for Enable real time insights from events:

image

Click on the Test Query button to load the captured logs:

image

The logs typically take a minute or two to show up, so if no logs are displayed, try executing the Test query again after a few minutes:

image

We can see that it is possible to edit the inbound processing policy to recreate the type of log entries that the API Management out-of-the-box diagnostic settings provide, but if that is not desired, it is possible to map the logs in the Event Hub to the logs in the diagnostic settings by matching the RequestId from the Event Hub logs with the CorrelationId of the APIM diagnostic settings logs, as shown in the screenshots below:

RequestId from Event Hub

image

CorrelationId from API Management Diagnostic Settings Logs

image
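The RequestId to CorrelationId mapping is effectively a join key between the two sets of logs. Here is a minimal sketch of that idea on plain Python dictionaries; the function name `join_on_request_id` and the sample rows are made up for illustration, and only the two key names come from the logs shown here.

```python
def join_on_request_id(event_hub_rows, gateway_rows):
    """Inner-join Event Hub rows to gateway rows where RequestId == CorrelationId."""
    by_correlation = {row["CorrelationId"]: row for row in gateway_rows}
    joined = []
    for event in event_hub_rows:
        match = by_correlation.get(event["RequestId"])
        if match is not None:
            # Combine the gateway fields with the identity fields.
            joined.append({**match, **event})
    return joined


# Made-up sample rows.
event_hub_rows = [
    {"RequestId": "aaa", "Name": "Example User"},
    {"RequestId": "zzz", "Name": "Unmatched User"},
]
gateway_rows = [
    {"CorrelationId": "aaa", "ResponseCode": 200},
    {"CorrelationId": "bbb", "ResponseCode": 200},
]
joined = join_on_request_id(event_hub_rows, gateway_rows)
```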

Note that there are different views available in the Event Hub logs. Below is a Raw view displayed as JSON:

image

As mentioned earlier, the JWT token passed for authorization is captured and it is possible to decode the value to view the full payload. If any additional fields are desired then the inbound processing policy can be modified to capture this information:

image

Now that we have the JWT token information captured, we can send the Azure Event Hubs logs into a Log Analytics Workspace and join the 2 tables together with KQL. I will be providing a walkthrough for how to accomplish this as outlined in this document:

Tutorial: Ingest events from Azure Event Hubs into Azure Monitor Logs (Public Preview)
https://learn.microsoft.com/en-us/azure/azure-monitor/logs/ingest-logs-event-hub

… in Part 2 of this series, in a future post.

I hope this helps anyone out there looking for a way to capture the identity of the user using the Azure OpenAI Service.

Wednesday, November 15, 2023

Securing Azure OpenAI with API Management to only allow access for specified Azure AD users

I’ve been spending most of my weekends playing around with Azure’s OpenAI service and two of the personal projects I’ve been working on are:

  1. How can I secure access to OpenAI’s API so control can be applied to what and who can make API calls to it
  2. How can I capture identity details for the application or user making the API call if we are to secure access with OAuth

This post will focus on item #1 while I get the notes I’ve captured for #2 organized and written as a blog post.

A common method I’ve found to provide the type of security for #1 is through leveraging the API Management service so I gave this pattern a shot over the weekend to test using an Azure API Management to only allow specified Azure AD users to call the Azure OpenAI API. The following is a high level architecture diagram and the flow of the traffic:

     image

Set up Azure API Management to publish Azure OpenAI

Begin by downloading the latest Azure OpenAI inference.json from the following Microsoft documentation: https://learn.microsoft.com/en-us/azure/ai-services/openai/reference#completions

image

For the purpose of this example, I will use the latest 2023-09-01-preview: https://github.com/Azure/azure-rest-api-specs/blob/main/specification/cognitiveservices/data-plane/AzureOpenAI/inference/preview/2023-09-01-preview/inference.json

image

Once downloaded, open the JSON file and edit the following two lines:

Use the name of the OpenAI instance to replace {endpoint}:

"url": "dev-openai/openai",

Use the full endpoint value:

"default": "https://dev-openai.openai.azure.com/"

image

image

With the JSON file prepared, proceed to deploy an Azure API Management resource with the SKU of choice, select the APIs blade, Add API and select OpenAPI:

image

Select Full and import the inference.json file that will automatically populate the fields, proceed to create the API:

image

Turn on the System Assigned Managed Identity for the APIM:

image

We’ll need to allow the APIM to call Azure OpenAI with the API key:

image

… and the best way to store the key is through a KeyVault so I’ve created a secret with the API key in a KeyVault:

image

I also granted the APIM system-assigned managed identity the Key Vault Secrets User role so it can access the key:

image

With the KeyVault and OpenAI secret configured, navigate to the APIM Named values blade and Add a new value:

image

Configure a named value to reference the secret in the KeyVault:

image

Note the name that you’ve used for the named value as you’ll be using it later on.

We’ll also be using the tenant ID for another configuration so repeat the same procedure and create a plain value with the tenant ID:

image

The following named values should be listed:

image

Proceed by navigating to the APIs blade, Azure OpenAI Service API, All operations, Design tab, and then click on the </> icon under the Inbound processing heading:

image

We’ll be configuring the following policy for the APIM to send a header with the name api-key and value of the secret we configured in the KeyVault:

GitHub repository: https://github.com/terenceluk/Azure/blob/main/API%20Management/XML/Set-Header-API-Key.xml

<!--

    IMPORTANT:

    - Policy elements can appear only within the <inbound>, <outbound>, <backend> section elements.

    - To apply a policy to the incoming request (before it is forwarded to the backend service), place a corresponding policy element within the <inbound> section element.

    - To apply a policy to the outgoing response (before it is sent back to the caller), place a corresponding policy element within the <outbound> section element.

    - To add a policy, place the cursor at the desired insertion point and select a policy from the sidebar.

    - To remove a policy, delete the corresponding policy statement from the policy document.

    - Position the <base> element within a section element to inherit all policies from the corresponding section element in the enclosing scope.

    - Remove the <base> element to prevent inheriting policies from the corresponding section element in the enclosing scope.

    - Policies are applied in the order of their appearance, from the top down.

    - Comments within policy elements are not supported and may disappear. Place your comments between policy elements or at a higher level scope.

-->

<policies>

<inbound>

<base />

<set-header name="api-key" exists-action="append">

<value>{{dev-openai}}</value>

</set-header>

</inbound>

<backend>

<base />

</backend>

<outbound>

<base />

</outbound>

<on-error>

<base />

</on-error>

</policies>

**Note that we use the {{ }} brackets to reference the named value as a variable.

image

Proceed to save the settings.

The APIM is now set up to receive OpenAI API calls, not with the Azure OpenAI api-key but with a subscription key for the APIM instance. To retrieve this key, navigate to the APIs blade, Azure OpenAI Service API, Settings tab, and then scroll down to the Subscription heading. Notice that Subscription required is enabled with the Header name and Query parameter name defined. The subscription key can be found in the Subscriptions blade:

image

API Management Logging Configuration

One last important configuration is Application Insights:

image

… and Azure Monitor logging:

image

Ensure that these are enabled so APIM data plane access logs and reports can be created. A few sample reports generated with KQL can be found here: https://github.com/Azure-Samples/openai-python-enterprise-logging

Here are a few sample outputs from 2 KQL queries:

Query to identify token usage by IP and model

ApiManagementGatewayLogs
| where tolower(OperationId) in ('completions_create','chatcompletions_create')
| where ResponseCode == '200'
| extend modelkey = substring(parse_json(BackendResponseBody)['model'], 0, indexof(parse_json(BackendResponseBody)['model'], '-', 0, -1, 2))
| extend model = tostring(parse_json(BackendResponseBody)['model'])
| extend prompttokens = parse_json(parse_json(BackendResponseBody)['usage'])['prompt_tokens']
| extend completiontokens = parse_json(parse_json(BackendResponseBody)['usage'])['completion_tokens']
| extend totaltokens = parse_json(parse_json(BackendResponseBody)['usage'])['total_tokens']
| extend ip = CallerIpAddress
| where model != ''
| summarize
    sum(todecimal(prompttokens)),
    sum(todecimal(completiontokens)),
    sum(todecimal(totaltokens)),
    avg(todecimal(totaltokens))
    by ip, model

image

GitHub repository: https://github.com/terenceluk/Azure/blob/main/Kusto%20KQL/Identify-token-usage-by-ip-and-mode.kusto
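For readers less familiar with KQL, the aggregation this query performs can be sketched in plain Python. This is only an illustration of the logic (filter to successful completion calls, parse the backend response, then sum and average the token counts per IP and model); the function name and the sample rows are made up, and the modelkey column is left out because the summary does not use it.

```python
import json
from collections import defaultdict


def token_usage_by_ip_and_model(rows):
    """Sum and average token usage per (caller IP, model), mirroring the KQL."""
    totals = defaultdict(lambda: {"prompt": 0, "completion": 0, "total": 0, "calls": 0})
    for row in rows:
        if row["OperationId"].lower() not in ("completions_create", "chatcompletions_create"):
            continue
        if str(row["ResponseCode"]) != "200":
            continue
        body = json.loads(row["BackendResponseBody"])
        model = body.get("model", "")
        if not model:
            continue
        usage = body.get("usage", {})
        bucket = totals[(row["CallerIpAddress"], model)]
        bucket["prompt"] += usage.get("prompt_tokens", 0)
        bucket["completion"] += usage.get("completion_tokens", 0)
        bucket["total"] += usage.get("total_tokens", 0)
        bucket["calls"] += 1
    return {
        key: {**value, "avg_total": value["total"] / value["calls"]}
        for key, value in totals.items()
    }


def _row(code, prompt, completion):
    # Build a made-up gateway log row for the demonstration.
    body = {"model": "gpt-4", "usage": {
        "prompt_tokens": prompt,
        "completion_tokens": completion,
        "total_tokens": prompt + completion,
    }}
    return {
        "OperationId": "ChatCompletions_Create",
        "ResponseCode": code,
        "CallerIpAddress": "10.0.0.4",
        "BackendResponseBody": json.dumps(body),
    }


usage = token_usage_by_ip_and_model([_row(200, 10, 5), _row(200, 20, 15), _row(500, 99, 99)])
```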

Query to monitor prompt completions

ApiManagementGatewayLogs
| where tolower(OperationId) in ('completions_create','chatcompletions_create')
| where ResponseCode == '200'
| extend model = tostring(parse_json(BackendResponseBody)['model'])
| extend prompttokens = parse_json(parse_json(BackendResponseBody)['usage'])['prompt_tokens']
| extend prompttext = substring(parse_json(parse_json(BackendResponseBody)['choices'])[0], 0, 100)

image

GitHub repository: https://github.com/terenceluk/Azure/blob/main/Kusto%20KQL/Monitor-prompt-completions.kusto

If you have experience setting API Management up to capture requests to Azure OpenAI, then you will already know that the only information representing the calling user that Log Analytics provides is the IP address. This isn’t very useful, so I have written another post to demonstrate how to capture the OAuth token details used to make the call:

How to log the identity of a user using an Azure OpenAI service with API Management logging (Part 1 of 2)
https://terenceluk.blogspot.com/2023/11/how-to-log-identity-of-user-using-azure.html

Testing OpenAI API calls through API Management with Postman

With the API Management configuration completed, we should now be able to use Postman to test querying the APIM. I won’t go into the details of the configuration but will provide the screenshots:

https://dev-openai-apim.azure-api.net/deployments/{{gpt_mode_4}}/chat/completions?api-version={{api_env_latest}}

image

{
  "messages": [
    {
      "role": "user",
      "content": "how many faces does a dice have?"
    }
  ],
  "temperature": 0.7,
  "top_p": 0.95,
  "frequency_penalty": 0,
  "presence_penalty": 0,
  "max_tokens": 800,
  "stop": null
}

image
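If you would rather script the same test than click through Postman, the request can be sketched in Python with only the standard library. The base URL and body come from this post; the deployment name gpt-4 and the api-version 2023-07-01-preview are placeholders standing in for the {{gpt_mode_4}} and {{api_env_latest}} Postman variables, and build_chat_request is a name I made up. The sketch only builds the request, so add whatever other headers your APIM instance expects before sending it.

```python
import json
import urllib.request

APIM_BASE_URL = "https://dev-openai-apim.azure-api.net"  # APIM endpoint used in this post


def build_chat_request(deployment, api_version, question, bearer_token=None):
    """Build (but do not send) a chat completions POST request for APIM."""
    url = f"{APIM_BASE_URL}/deployments/{deployment}/chat/completions?api-version={api_version}"
    payload = {
        "messages": [{"role": "user", "content": question}],
        "temperature": 0.7,
        "top_p": 0.95,
        "frequency_penalty": 0,
        "presence_penalty": 0,
        "max_tokens": 800,
        "stop": None,  # serialized as JSON null
    }
    headers = {"Content-Type": "application/json"}
    if bearer_token:
        # Only needed once APIM validates tokens.
        headers["Authorization"] = f"Bearer {bearer_token}"
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers=headers,
        method="POST",
    )


# Placeholder deployment name and API version; substitute your own values.
request = build_chat_request("gpt-4", "2023-07-01-preview", "how many faces does a dice have?")
print(request.get_method(), request.full_url)

# To actually send it:
# with urllib.request.urlopen(request) as response:
#     print(json.loads(response.read()))
```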

I’ll write another post in the future to properly secure Azure OpenAI now that we have APIM publishing the APIs.

Create an App Registration for securing APIM API access

With Azure API Management configured to publish the Azure OpenAI APIs, we will now proceed to create an App Registration that will allow us to lock down APIM access to select Entra ID / Azure AD users.

image

Provide a name for the App Registration and create the object:

image

Select the App roles blade, click on Create app role and fill out the following:

Display name: <Provide a display name>

Allowed member types: Select Users/Groups or Both (Users/Groups + Applications)

Value: APIM.Access

Description: Allow Azure OpenAI API access.

Create the app role.
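For reference, the app role created in the portal ends up as an entry in the appRoles array of the App Registration's manifest. The sketch below builds an equivalent entry in Python so you can see which fields the form maps to. The display name "APIM Access" is a hypothetical example, build_app_role is my own helper name, and the id is just a freshly generated GUID.

```python
import json
import uuid


def build_app_role(display_name, value, description, allow_applications=False):
    """Return a dict shaped like one entry of the manifest's appRoles array."""
    member_types = ["User"]  # "Users/Groups" in the portal
    if allow_applications:
        member_types.append("Application")  # "Both" in the portal
    return {
        "allowedMemberTypes": member_types,
        "description": description,
        "displayName": display_name,
        "id": str(uuid.uuid4()),  # each app role needs its own unique GUID
        "isEnabled": True,
        "value": value,  # this is what shows up in the token's roles claim
    }


role = build_app_role(
    display_name="APIM Access",  # hypothetical display name
    value="APIM.Access",
    description="Allow Azure OpenAI API access.",
)
print(json.dumps(role, indent=2))
```

The value field is the important one, since it is the string the APIM policy later checks for in the roles claim.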

imageimage

Select the Expose an API blade, and click on the Add link beside Application ID URI:

image

Leave the Application ID URI as the default and click on the Save button:

image

We’ll be using Azure CLI to quickly test the retrieval of the token so we’ll need to create a scope and add Azure CLI as an authorized client application.

Proceed to click on Add a scope and fill in the following properties:

Scope name: API.Access

Who can consent: Admins and users

Admin consent display name: Access to Azure OpenAI API

Admin consent description: Allows users to access the Azure OpenAI API

State: Enabled

image

Click on Add a client application and add the Client ID of Azure CLI (04b07795-8ddb-461a-bbee-02f9e1bf7b46) as an authorized client application so it can retrieve a delegated access token:

image

I will also be demonstrating how to set up Postman to test the retrieval of the token, so we’ll need to add Postman’s callback URL as a Redirect URI for the App Registration. Navigate to the Authentication blade, click on Add a platform, and add the following URI: https://oauth.pstmn.io/v1/callback

image

We will also need to create a secret for the App Registration so Postman is able to securely authenticate and retrieve a delegated token on behalf of the user. Navigate to the Certificates & secrets blade, create a Client secret, then save the secret value as it is only displayed once:

image

With the App Registration created, we’ll need to assign the role to a user so we can test calling the OpenAI API published by APIM. Copy the Application (client) ID of the App Registration, navigate to the Enterprise applications blade and search for the Application ID:

image

Open the Enterprise Application object, navigate to the Users and groups blade, and click on Add user/group:

image

Select the user who we’ll be testing with and assign the user:

imageimage

With the Enterprise Application configured and the user assigned, we will now proceed to lock down the APIM inbound processing policy. Open the APIM resource in the portal, navigate to the APIs blade, Azure OpenAI Service API, Design tab, and click on the </> button under Inbound processing:

image

Proceed to add the <validate-jwt> tag content and note that we use the {{Tenant-ID}} named value variable we created earlier:

GitHub Repository: https://github.com/terenceluk/Azure/blob/main/API%20Management/XML/Validate-JWT-Access-Claim.xml

<!--

    IMPORTANT:

    - Policy elements can appear only within the <inbound>, <outbound>, <backend> section elements.

    - To apply a policy to the incoming request (before it is forwarded to the backend service), place a corresponding policy element within the <inbound> section element.

    - To apply a policy to the outgoing response (before it is sent back to the caller), place a corresponding policy element within the <outbound> section element.

    - To add a policy, place the cursor at the desired insertion point and select a policy from the sidebar.

    - To remove a policy, delete the corresponding policy statement from the policy document.

    - Position the <base> element within a section element to inherit all policies from the corresponding section element in the enclosing scope.

    - Remove the <base> element to prevent inheriting policies from the corresponding section element in the enclosing scope.

    - Policies are applied in the order of their appearance, from the top down.

    - Comments within policy elements are not supported and may disappear. Place your comments between policy elements or at a higher level scope.

-->

<policies>

<inbound>

<base />

<set-header name="api-key" exists-action="append">

<value>{{bma-dev-openai}}</value>

</set-header>

<validate-jwt header-name="Authorization" failed-validation-httpcode="403" failed-validation-error-message="Forbidden">

<openid-config url="https://login.microsoftonline.com/{{Tenant-ID}}/v2.0/.well-known/openid-configuration" />

<issuers>

<issuer>https://sts.windows.net/{{Tenant-ID}}/</issuer>

</issuers>

<required-claims>

<claim name="roles" match="any">

<value>APIM.Access</value>

</claim>

</required-claims>

</validate-jwt>

</inbound>

<backend>

<base />

</backend>

<outbound>

<base />

</outbound>

<on-error>

<base />

</on-error>

</policies>

image

Proceed to save and we are now ready to test with Azure CLI.
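To make it clearer what the issuer and required-claims parts of the policy are checking, here is a rough Python sketch of the same decision applied to an already decoded set of claims. It is not how APIM implements validate-jwt, and it deliberately leaves out the signature and expiry validation that the real policy performs. The function name jwt_claims_allowed and the sample tenant ID are made up for the example.

```python
def jwt_claims_allowed(claims, tenant_id, required_roles=("APIM.Access",)):
    """Mimic the <issuers> and <required-claims> checks of the policy.

    Illustration only: signature and lifetime validation are NOT covered here.
    """
    expected_issuer = f"https://sts.windows.net/{tenant_id}/"
    if claims.get("iss") != expected_issuer:
        return False
    roles = claims.get("roles", [])
    if isinstance(roles, str):
        roles = [roles]
    # match="any": one matching role value is enough
    return any(role in roles for role in required_roles)


tenant = "00000000-0000-0000-0000-000000000000"  # placeholder tenant ID
good = {"iss": f"https://sts.windows.net/{tenant}/", "roles": ["APIM.Access"]}
no_role = {"iss": f"https://sts.windows.net/{tenant}/"}

print(jwt_claims_allowed(good, tenant))     # user with the app role assigned
print(jwt_claims_allowed(no_role, tenant))  # user without the app role
```

A caller whose token has no APIM.Access role fails the check, which is the case APIM answers with the 403 Forbidden configured in the policy.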

Testing Token Retrieval with Azure CLI and API Management API calls with Postman

Launch a prompt with Azure CLI available and execute:

az login

Complete the login with the test account:

image

Next, we’ll need to copy the Application ID URI:

image

… and execute:

az account get-access-token --resource api://12bccc26-b778-4a2d-ae7a-4f5732e7a79d

image

A token should be returned:

image
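If you want to grab the token from a script instead of copying it out of the terminal, a small Python wrapper around the same Azure CLI command could look like the sketch below. It assumes az is installed and that you have already run az login, and it relies on the accessToken field of the command's JSON output. The helper names get_access_token and bearer_header are mine.

```python
import json
import shutil
import subprocess


def get_access_token(resource):
    """Run `az account get-access-token` and return the parsed JSON output."""
    az_path = shutil.which("az")  # also resolves az.cmd on Windows
    if az_path is None:
        raise RuntimeError("Azure CLI (az) was not found on PATH")
    completed = subprocess.run(
        [az_path, "account", "get-access-token", "--resource", resource, "--output", "json"],
        capture_output=True,
        text=True,
        check=True,
    )
    return json.loads(completed.stdout)


def bearer_header(cli_result):
    """Turn the CLI output into the Authorization header APIM expects."""
    return {"Authorization": f"Bearer {cli_result['accessToken']}"}


# Example usage (requires a prior `az login`):
# result = get_access_token("api://12bccc26-b778-4a2d-ae7a-4f5732e7a79d")
# print(bearer_header(result))
```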

Copying the token and pasting it into https://jwt.io/ should confirm that the token has the role APIM.Access:

image
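If you prefer not to paste a token into a website, the payload can also be inspected locally. The following sketch base64url-decodes the middle segment of a JWT and prints its claims. It does not verify the signature, so use it for inspection only. The sample token is fabricated for the demonstration and the helper names are my own.

```python
import base64
import json


def decode_jwt_payload(token):
    """Return the claims of a JWT without verifying its signature."""
    parts = token.split(".")
    if len(parts) != 3:
        raise ValueError("expected a JWT with three dot-separated segments")
    payload = parts[1]
    payload += "=" * (-len(payload) % 4)  # restore the stripped base64 padding
    return json.loads(base64.urlsafe_b64decode(payload))


def _b64url(obj):
    # Helper used only to fabricate a sample token for this demo.
    raw = json.dumps(obj).encode("utf-8")
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode("ascii")


sample_token = ".".join([
    _b64url({"alg": "none", "typ": "JWT"}),
    _b64url({"roles": ["APIM.Access"]}),
    "signature-placeholder",
])

claims = decode_jwt_payload(sample_token)
print(claims.get("roles"))
```

With a real token from the Azure CLI, the roles claim should contain APIM.Access for a user assigned to the Enterprise Application.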

You should now be able to use the token to call APIM with delegated access and receive a 200 OK status:

image

Calling APIM without a token in the Authorization header will fail with:

{

"statusCode": 403,

"message": "Forbidden"

}

image

Removing the user from the Enterprise Application and attempting to call APIM will also result in the same failure message:

image

{

"statusCode": 403,

"message": "Forbidden"

}

image

Testing Token Retrieval and API Management API calls with Postman 

Proceed to launch Postman, navigate to the Environments area and create the following variables:

tenant_id: <The App Registration’s Directory (tenant) ID>

client_id_APIM: <The App Registration’s Application (client) ID>

client_secret_APIM: <The secret we created earlier>

Next, create a new request, navigate to the Authorization tab and fill in the following:

Type: OAuth 2.0

Add authorization data to: Request Headers

Token: Available Tokens

Header Prefix: Bearer

Token Name: <Name of preference>

Grant type: Authorization Code

Callback URL: https://oauth.pstmn.io/v1/callback

Authorize using browser: Enabled

Auth URL: https://login.microsoftonline.com/{{tenant_id}}/oauth2/v2.0/authorize

Access Token URL: https://login.microsoftonline.com/{{tenant_id}}/oauth2/v2.0/token

Client ID: {{client_id_APIM}}

Client Secret: {{client_secret_APIM}}

Scope: api://12bccc26-b778-4a2d-ae7a-4f5732e7a79d/API.Access

Client Authentication: Send as Basic Auth header

**Note the default Callback URL is set as https://oauth.pstmn.io/v1/callback, which is the URL we configured earlier for the App Registration’s Redirect URI.
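To show what Postman does with these settings when it opens the browser, here is a hedged Python sketch that assembles the authorization request URL for the authorization code flow. The tenant ID and client ID below are placeholders, build_authorize_url is a name I made up, and the token exchange that follows (posting the returned code to the Access Token URL) is not shown.

```python
import secrets
from urllib.parse import parse_qs, urlencode, urlparse


def build_authorize_url(tenant_id, client_id, scope,
                        redirect_uri="https://oauth.pstmn.io/v1/callback",
                        state=None):
    """Build the URL the browser is sent to for the authorization code grant."""
    query = {
        "client_id": client_id,
        "response_type": "code",  # Grant type: Authorization Code
        "redirect_uri": redirect_uri,
        "scope": scope,
        "state": state or secrets.token_urlsafe(16),  # protects against CSRF
    }
    return (
        f"https://login.microsoftonline.com/{tenant_id}/oauth2/v2.0/authorize"
        f"?{urlencode(query)}"
    )


# Placeholder tenant and client IDs; the scope is the one used in this post.
url = build_authorize_url(
    tenant_id="00000000-0000-0000-0000-000000000000",
    client_id="11111111-1111-1111-1111-111111111111",
    scope="api://12bccc26-b778-4a2d-ae7a-4f5732e7a79d/API.Access",
    state="demo-state",
)
print(url)
print(parse_qs(urlparse(url).query)["scope"])
```

The redirect_uri has to match the Redirect URI configured on the App Registration, which is why the Postman callback URL was added earlier.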

Leave the rest as default and click on Get New Access Token:

imageimage

A Get new access token window will be displayed and a browser will open to login.microsoftonline.com. Proceed to log into Entra ID to retrieve the token.

Repeat the steps demonstrated in the Azure CLI instructions to call the OpenAI endpoints through APIM, this time with the token retrieved by Postman.

----------------------------------------------------------------------------------------------------------------------------

I hope this helps anyone who may be looking for a way to lock down APIM access when publishing Azure OpenAI APIs. There are other infrastructure components that will need to be secured to ensure no calls can reach the Azure OpenAI API without going through APIM, and I will write another blog post for the design and configuration in the future.