`?`	Open keyboard shortcuts
`Ctrl/Cmd`+`K`	Open search
`Ctrl/Cmd`+`Shift`+`K`	Open tag search
`Ctrl/Cmd`+`Shift`+`G`	Open global graph
`Esc`	Close the active popup

Overview

Organizations adopting Generative AI on Azure often want two key capabilities:

Pay-as-you-go (PAYG) pricing – to avoid idle infrastructure costs associated with hosting large models.
Private network access – to ensure that AI services are not exposed over the public internet and meet security requirements.

Azure Databricks provides capabilities that partially satisfy both requirements, but there are important architectural constraints when trying to combine them.

This information sheet explains what is possible, what is not currently supported, and recommended architecture options.

TL

Azure Databricks can support both PAYG GenAI and private networking architectures, but they cannot currently be combined for Databricks-hosted foundation model APIs.

Organizations must decide whether cost efficiency or private endpoint enforcement is the primary architectural requirement.

PAYG foundation model APIs

Lowest cost

No hosting required

No private serving endpoint

Provisioned/custom model serving

Private endpoints supported

Higher cost

Requires provisioned infrastructure

Organizations should choose the architecture based on whether cost optimization or network isolation is the higher priority.

1. Using GenAI Models in Azure Databricks

Azure Databricks provides Foundation Model APIs that allow users to access large language models and other GenAI models directly from their Databricks workspace.

Foundational Model APIs documentation

These APIs enable:

Prompt-based model inference
Chat completion
Embeddings
Model evaluation and experimentation

The models are hosted and managed by Databricks, meaning customers do not need to provision GPU clusters or manage infrastructure.

Key Characteristics

Managed model hosting
Token-based usage billing
Direct integration with Databricks notebooks, jobs, and ML pipelines
Supports popular foundation models (LLMs and embedding models)

Supported models documentation

2. Pay-As-You-Go Pricing Model

Azure Databricks Foundation Model APIs support a pay-per-token pricing model, which provides a natural PAYG cost structure.

Pricing and usage documentation

Benefits

No infrastructure provisioning required
No idle GPU cost
Scales automatically based on usage
Costs are tied directly to inference volume

This approach is well suited for:

Experimental GenAI development
Low or variable traffic workloads
Internal copilots or AI assistants
Prototyping RAG pipelines

3. Private Networking in Azure Databricks

Azure Databricks supports private connectivity through Azure Private Link, enabling organizations to prevent public internet exposure of workspace resources.

Workspace Private Link

Allows secure inbound connectivity to the Databricks workspace.

Benefits:

Workspace accessible only through a private network
Integration with enterprise VNET architectures
Eliminates public endpoint access

Private Link documentation

Serverless Private Connectivity (Network Connectivity Configurations)

Databricks also provides Network Connectivity Configurations (NCC) which enable serverless Databricks compute services to securely access Azure resources through private endpoints.

Serverless Private Link documentation

Examples of resources accessed privately include:

Azure Storage
Azure SQL
Azure Key Vault
Internal enterprise APIs

4. Limitation: PAYG Foundation Models and Private Endpoints

The key limitation is related to Model Serving networking capabilities.

While Azure Databricks supports private networking in many areas, Databricks-hosted pay-per-token foundation model endpoints currently do not support private endpoint access for the model serving endpoint itself.

Private connectivity for model serving is supported only for:

Provisioned Throughput Endpoints
Custom Model Serving Endpoints

Model serving documentation

This means that the Foundation Model PAYG endpoints cannot currently be placed behind private endpoints.

5. Architecture Options

Organizations must choose between two primary architecture patterns depending on their priorities.

Option A — PAYG-Optimized Architecture

Objective

Minimize infrastructure costs while still maintaining secure access to the Databricks environment.

Architecture Components

Azure Databricks Workspace
Azure Private Link for workspace access
Foundation Model APIs (Pay-Per-Token)
Databricks notebooks or applications invoking the models

Foundation model APIs documentation

Characteristics

Pros

True PAYG model inference
No GPU hosting costs
Simple operational model

Cons

Model serving endpoint itself cannot be private-endpoint restricted
Some outbound access to Databricks-hosted APIs is required

Best For

Development environments
R&D teams
Internal productivity tools
Low-risk workloads

Option B — Private-Endpoint-First Architecture

Objective

Ensure model serving occurs entirely within private networking boundaries.

Architecture Components

Azure Databricks Workspace with Private Link
Provisioned Throughput Model Serving
Custom Model Serving endpoints
Private endpoint connectivity for serving endpoints

Model serving architecture documentation

Characteristics

Pros

Model serving endpoints can be privately exposed
Strongest security posture
Meets strict enterprise networking requirements

Cons

Requires provisioned capacity
Higher cost than PAYG
Infrastructure management required

Best For

Regulated environments
Sensitive data processing
Enterprise AI production systems
Strict network isolation policies

6. Recommendation Framework

When choosing an architecture, consider the following decision factors:

Requirement	Recommended Approach
Lowest possible cost	PAYG Foundation Model APIs
Fully private AI serving	Provisioned throughput endpoints
Prototype GenAI applications	PAYG
Enterprise production workloads	Private serving architecture
High-security environments	Private endpoints + provisioned serving

Graph View

Keyboard shortcuts

Explorer

Azure Databricks - Model Serving - PAYG & Private Networks

Overview

1. Using GenAI Models in Azure Databricks

Key Characteristics

2. Pay-As-You-Go Pricing Model

Benefits

3. Private Networking in Azure Databricks

Workspace Private Link

Serverless Private Connectivity (Network Connectivity Configurations)

4. Limitation: PAYG Foundation Models and Private Endpoints

5. Architecture Options

Option A — PAYG-Optimized Architecture

Objective

Architecture Components

Characteristics

Best For

Option B — Private-Endpoint-First Architecture

Objective

Architecture Components

Characteristics

Best For

6. Recommendation Framework

Graph View

Keyboard shortcuts

Graph View

Table of Contents