Pages

Showing posts with label Azure AI Services. Show all posts
Showing posts with label Azure AI Services. Show all posts

Monday, March 25, 2024

Accessing SharePoint document library with AI Search for Azure OpenAI GPT

The first quarter of this year has been insanely busy and has led to my inability to blog as much. I have been carving whatever time I have over the weekends to continue testing new Azure AI Services but couldn’t find the time to clear out my backlog of blog topics.

One of the recent tests I’ve done is to test out the in-preview feature of using Azure AI Search (previously known as Cognitive Search) to index a SharePoint Online document library. This feature has been one that I had interest in because of the vast amounts of SharePoint libraries I work with across different clients and the ability to use Azure OpenAI to tap into the libraries would be very attractive. CoPilot Studio offers such a feature where you can easily configure a SharePoint URL for a CoPilot to tap into but I still prefer Azure AI Services as I feel the flexibility that development offers provides much more creative ideas.

With the above said, the purpose of this post is to provide 2 scripts:

  1. Script to create the App Registration with the appropriate permissions to access SharePoint Online
  2. Script that will create the AI Search data source, indexer, and index

The deployment will follow the same found in the following Microsoft document: https://learn.microsoft.com/en-us/azure/search/search-howto-index-sharepoint-online

Please take the time to read the supported document formats and limitations of this feature.

Step 1 – Enable system managed assigned managed identity

This step is optional and isn’t needed for the configuration in this blog post because we’ll be including the tenant ID in the connection string of the data source but if your environment uses AI Search to access storage accounts then this will likely already be enabled.

Step 2 – Decide which permissions the indexer requires

There are advantages and disadvantages to go either way. This post will demonstrate using application permissions but the script also includes commented out code for delegated.

Step 3 - Create a Microsoft Entra application registration that will be used to access the SharePoint document library

Use the following script to create the App Registration that will configure the required permissions: https://github.com/terenceluk/Azure/blob/main/AI%20Services/SharePoint%20Online%20Indexer/Create-App-Registration.sh

Note that there does not appear to be a way for Azure CLI to configure Platform configurations so you'll need to manually perform the following after the App Registration is created:

  1. Navigate to the Authentication tab of the App Registration
  2. Set Allow public client flows to Yes then select Save.
  3. Select + Add a platform, then Mobile and desktop applications, then check https://login.microsoftonline.com/common/oauth2/nativeclient, then Configure.

Step 4 to 7 – Create SharePoint data source, indexer, index, and get properties of index

Use the following PowerShell script to create and configure the above components in the desired AI Search: https://github.com/terenceluk/Azure/blob/main/AI%20Services/SharePoint%20Online%20Indexer/Configure-AI-Search-for-SharePoint.ps1

The following components should be displayed when successfully configured:

AI Search Data Source

AI Search Indexer

AI Search Index

Test Chatbot with SharePoint Online document library data

With the AI Search configured to tap into the SharePoint Online library, we can now use the Azure Open AI Studio to test chatting with the data.

Launch Azure Open AI Studio:


Select Add your data and Add a data source:


Select Azure AI Search as the data source:


Select the appropriate subscription, AI Search service, and the index that was created.

There is also an option to customize the field mapping rather than using the default.

These two screenshots show the customization options:



For those who have watched the YouTube videos demonstrating the configuration, most of them have selected “content” for all the fields but as shown in the screenshot below, this is not allowed anymore as of March 23, 2024 because if such an attempt is made, the following error message will be displayed:

You cannot use the same column data in multiple fields

Proceeding to the Data management configuration will reveal that Semantic search is not available:

Only Keyword is available:

Review the configuration and complete the setup:

You should now be able to chat with your data:

Thoughts and Options

As noted in the beginning of the Microsoft document, this preview feature isn’t recommended for production workloads and Microsoft is very clear in the limitations section indicating:

  • If you need a SharePoint content indexing solution in a production environment, consider creating a custom connector with SharePoint Webhooks, calling Microsoft Graph API to export the data to an Azure Blob container, and then use the Azure Blob indexer for incremental indexing.

Using the indexer against an Azure Storage account opens up text embedding model capabilities that provide vector and semantic search, which would yield much better results. However, if the requirement is simply to gain some light insight into a SharePoint document library then piloting this preview feature and waiting for it to GA may be a good initiative.


Sunday, October 22, 2023

Deploy a ChatGPT service with Azure OpenAI Service in 6 minutes with PowerShell

OpenAI’s ChatGPT has been one of the most talked about services since its launch on November 30th, 2022 amongst my professional contacts as well as personal friends. What this Chat Generative Pre-trained Transformer can perform is truly remarkable and opens up so many possibilities in the future. Many of my colleagues have asked me whether I’ve tested it and why I haven’t written any blog posts since Azure released the OpenAI service preview in March 2023. The short answer is that I have performed some testing with it over the last few months but haven’t been able to commit the amount of time I want due to my busy work schedule. I finally had a bit of a breather over the past few weeks so I’ve managed to really try out the following:

  1. Pairing with Cognitive Search with a RAG (Retrieval Augmentation Generation) architecture to augment the ChatGPT LLM (Large Language Model) to add data in a Azure Storage Account
  2. Deploying front-end UI solutions for the OpenAI service
  3. Diving deep into how to secure Azure OpenAI, Cognitive Searches, and data sources with private endpoints and shared private access

It’s amazing how much material there is for #1 and #2 but not as much as I’d like for #3. There is so much Azure’s AI Services can do and I look forward to the projects to come in the following years.

The purpose of this blog post is to show just how fast and easy it is to deploy an Azure OpenAI service with a front-end UI for a private ChatGPT service where internal employees of organizations can safely enter questions with sensitive data. Microsoft is very clear on the usage of the inputs entered in the prompt (https://learn.microsoft.com/en-us/legal/cognitive-services/openai/data-privacy):

Your prompts (inputs) and completions (outputs), your embeddings, and your training data:

  • are NOT available to other customers.
  • are NOT available to OpenAI.
  • are NOT used to improve OpenAI models.
  • are NOT used to improve any Microsoft or 3rd party products or services.
  • are NOT used for automatically improving Azure OpenAI models for your use in your resource (The models are stateless, unless you explicitly fine-tune models with your training data).
  • Your fine-tuned Azure OpenAI models are available exclusively for your use.

The Azure OpenAI Service is fully controlled by Microsoft; Microsoft hosts the OpenAI models in Microsoft’s Azure environment and the Service does NOT interact with any services operated by OpenAI (e.g. ChatGPT, or the OpenAI API).

This will put many organizations at ease as I’ve been to one too many dinner parties where I’ve heard people talk about entering data into OpenAI’s ChatGPT to write a letter to HR. I don’t even want to ask what they were entering in there and what else it has been used for.

In any case, I took some time to put together a PowerShell script that prompts for a few questions about what to name the resource group containing all the resources to be created, the name of the Azure OpenAI instance, the LLM model to use, what Azure subscription to use, and it takes care of the rest (Container App, Log Analytics Workspace, etc). I timed the duration of the script and it took 5 minutes and 32 seconds to run. Yes, I understand this is an imperative run rather than declarative. I’m a huge supporter of Infrastructure of Code but I needed something that would allow me to run in any Azure environment to quickly build a demo with all components in a Resource Group so I can easily tear it down by simply deleting the RG.

The deployment is very basic with no private endpoints as I will reserve that for a future post. Here is the simple topology:

image

With that, let’s get into it now.

Prerequisites

As of October 22, 2023, you may see the Azure OpenAI service as an option in the Azure AI Services blade but attempting to create the service will display the following message:

image

Azure OpenAI Service is currently available to customers via an application form. The selected subscription has not been enabled for use of the service and does not have quota for any pricing tiers. Click here to request access to Azure OpenAI service.

image

Clicking on the link will bring you to a Microsoft Form with questions about who you are, why you want to use the service, and what features you would want to turn on:

image

**I’ve blocked out the content in the screenshot of the form as I am unsure if posting the verbiage is in violation of Microsoft’s policy.

You’ll need to fill out the form, submit it, and receive an approval that is indicated to take up to 10 business days. My form submission took only a day but I assume this can vary so if you fill out the form intend on using the service so you don’t have to wait when you actually want to deploy.

Using a PowerShell script to deploy all the services in 6 minutes (or less)

The PowerShell script I put together can be retrieved from my GitHub repository here: https://github.com/terenceluk/Azure/blob/main/AI%20Services/Deploy-Azure-OpenAI-with-Chatbot-UI.ps1

The script is meant to be executed from the console and it will ask for the user to input:

  1. Select a subscription found in the tenant
  2. Provide a name for a new Resource Group
  3. Provide a name for the OpenAI instance
  4. Select a model from the options
image

The rest of the components such as Container App and Log Analytics will be automatically named (derived from the instance name) and deployed through the remaining script. At the end of a successful run, the browser will automatically launch and the following screen will be displayed:

imageimage

Azure Resources Deployed

All of the resources for the solution are meant to be deployed into a single resource group for ease of cleanup if it is used for a demo:

image

The following are screenshots of the resources:

image

image

image

Note that the script will not place the value of the Azure OpenAI key into the environment as a variable, rather, it will store it as a secret that the environment variable references:

image

image

I did not create a custom health probe so the one created is the default:

image

Securing the ChatGPT UI portal with authentication

One of the components I’m still working on is to use BICEP to configure the Container App with Microsoft as an identity provider so the portal would prompt the user for credentials and they are required to log in with a valid account in the tenant’s Entra ID / Azure AD before getting into the portal. If you’d like to turn this on after the script deploys the services, simply navigate to the Container App’s Authentication blade, click on Add identity provider:

image

Select Microsoft as the Identity provider:

image

You can leave the settings as default and proceed to create the identity provider:

image

This will create an App Registration in the tenant’s Entra ID / Azure AD for the Container App to authenticate the user:

image

Note that you would need to consent the Container App’s App Registration in the portal.azure.com or perform it upon first logging in:

image

Credits

I want to give a huge thanks to Mckay Wrigley (https://github.com/mckaywrigley) for developing and sharing out his chatbot-ui docker container (https://github.com/mckaywrigley/chatbot-ui) for the world to use. If you search the internet for deployment demonstrations, you are bound to see 9 of the 10 demos using his Chatbot UI. I spent quite a bit of time using Postman to interact with the Azure OpenAI service APIs and as I am not a developer, it would take me quite a bit of time to develop something half as great as Mckay’s.

Final Remarks

One of the behaviors I noticed during the creation and deletion of the services is that when an Azure Open AI instance is deleted, it is dropped into a recycling bin like location and if you decide to deploy another instance in the same name then it will fail. If you have deleted and instance and want to use the same name then use the Manage deleted resources in the Azure OpenAI blade to locate and purge the instance. From what I can tell, the purge is instant and you can proceed to redeploy a new instance with the same name.

image

I hope this provides anyone out there who is looking to test this great service offering out but haven’t had the time to get started. There are many other great posts I’d like to write about Cognitive Search and the “under the hood view” of the traffic flow but I will save that for another day. Happy chatting!