bge-m3
Text Embeddings • baai
Multi-Functionality, Multi-Linguality, and Multi-Granularity embeddings model.
| Model Info | |
|---|---|
| Batch | Yes | 
| Unit Pricing | $0.012 per M input tokens | 
Usage
Workers - TypeScript

    export interface Env {
      AI: Ai;
    }

    export default {
      async fetch(request, env): Promise<Response> {
        // Can be a string or an array of strings
        const stories = [
          "This is a story about an orange cloud",
          "This is a story about a llama",
          "This is a story about a hugging emoji",
        ];

        const embeddings = await env.AI.run("@cf/baai/bge-m3", {
          text: stories,
        });

        return Response.json(embeddings);
      },
    } satisfies ExportedHandler<Env>;

Python

    import os
    import requests

    ACCOUNT_ID = "your-account-id"
    AUTH_TOKEN = os.environ.get("CLOUDFLARE_AUTH_TOKEN")

    stories = [
        'This is a story about an orange cloud',
        'This is a story about a llama',
        'This is a story about a hugging emoji',
    ]

    response = requests.post(
        f"https://api.cloudflare.com/client/v4/accounts/{ACCOUNT_ID}/ai/run/@cf/baai/bge-m3",
        headers={"Authorization": f"Bearer {AUTH_TOKEN}"},
        json={"text": stories},
    )

    print(response.json())

curl

    curl https://api.cloudflare.com/client/v4/accounts/$CLOUDFLARE_ACCOUNT_ID/ai/run/@cf/baai/bge-m3 \
      -X POST \
      -H "Authorization: Bearer $CLOUDFLARE_API_TOKEN" \
      -d '{ "text": ["This is a story about an orange cloud", "This is a story about a llama", "This is a story about a hugging emoji"] }'

Parameters
* indicates a required field
Input
- 0 object
  - query string min 1: A query you wish to perform against the provided contexts. If no query is provided, the model will respond with embeddings for the contexts.
  - contexts * array: List of provided contexts. Note that the index in this array is important, as the response will refer to it.
    - items object
      - text string min 1: The content of one of the provided contexts.
  - truncate_inputs boolean: When a context is too long, should the model error out or truncate the context to fit?
- 1 object
  - text * one of
    - 0 string min 1: The text to embed
    - 1 array: Batch of text values to embed
      - items string min 1: The text to embed
  - truncate_inputs boolean: When a context is too long, should the model error out or truncate the context to fit?
- 2 object
  - requests * array: Batch of the embeddings requests to run using async-queue
    - items one of
      - 0 object
        - query string min 1: A query you wish to perform against the provided contexts. If no query is provided, the model will respond with embeddings for the contexts.
        - contexts * array: List of provided contexts. Note that the index in this array is important, as the response will refer to it.
          - items object
            - text string min 1: The content of one of the provided contexts.
        - truncate_inputs boolean: When a context is too long, should the model error out or truncate the context to fit?
      - 1 object
        - text * one of
          - 0 string min 1: The text to embed
          - 1 array: Batch of text values to embed
            - items string min 1: The text to embed
        - truncate_inputs boolean: When a context is too long, should the model error out or truncate the context to fit?
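
The three input shapes listed above can be easier to tell apart when written as request bodies. The following is a minimal sketch that only builds the JSON payloads and sends nothing over the network. The helper names (`embedding_payload`, `query_payload`, `batch_payload`) are hypothetical and are not part of any Cloudflare SDK; the field names come from the parameter list.

```python
import json
from typing import Any, Dict, List, Union


def embedding_payload(text: Union[str, List[str]], truncate_inputs: bool = False) -> Dict[str, Any]:
    """Input shape 1: embed a single string or a batch of strings."""
    return {"text": text, "truncate_inputs": truncate_inputs}


def query_payload(query: str, contexts: List[str], truncate_inputs: bool = False) -> Dict[str, Any]:
    """Input shape 0: a query plus the contexts it is run against.

    Each context is wrapped in an object with a "text" field. The order of
    the list matters because the response refers to contexts by index.
    """
    return {
        "query": query,
        "contexts": [{"text": c} for c in contexts],
        "truncate_inputs": truncate_inputs,
    }


def batch_payload(requests: List[Dict[str, Any]]) -> Dict[str, Any]:
    """Input shape 2: wrap several shape-0 or shape-1 requests for the async queue."""
    return {"requests": requests}


if __name__ == "__main__":
    body = batch_payload([
        embedding_payload(["This is a story about a llama"]),
        query_payload("llama", ["An orange cloud", "A llama"]),
    ])
    print(json.dumps(body, indent=2))
```

Any of these dictionaries could be passed as the `json=` argument in the Python usage example in place of `{"text": stories}`.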
Output
- 0 object
  - response array
    - items object
      - id integer: Index of the context in the request
      - score number: Score of the context under the index.
- 1 object
  - response array
    - items array
      - items number
  - shape array
    - items number
  - pooling string: The pooling method used in the embedding process.
- 2 object
  - shape array
    - items number
  - data array: Embeddings of the requested text values
    - items array: Floating point embedding representation shaped by the embedding model
      - items number
  - pooling string: The pooling method used in the embedding process.
- 3 object
  - request_id string: The async request id that can be used to obtain the results.
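
Because the output can take four different shapes, client code usually needs to check which one it received. The sketch below shows one possible way to do that by looking at which keys are present. The function names are hypothetical, the input is assumed to be the model output object itself (unwrapping it from the HTTP response is left out), and treating a higher `score` as a better match is an assumption that the parameter list does not state.

```python
from typing import Any, Dict, List, Tuple


def classify_output(result: Dict[str, Any]) -> str:
    """Guess which of the four documented output shapes `result` has."""
    if "request_id" in result:
        return "async"  # shape 3: async-queue acknowledgement
    if "data" in result:
        return "embedding"  # shape 2: embeddings for `text` input
    response = result.get("response")
    if isinstance(response, list) and response:
        # shape 0 items are objects with id/score; shape 1 items are number arrays
        if isinstance(response[0], dict):
            return "query_scores"
        return "context_embeddings"
    return "unknown"


def rank_contexts(result: Dict[str, Any], contexts: List[str]) -> List[Tuple[str, float]]:
    """Pair each context with its score, highest score first.

    Assumption: a higher score means a stronger match with the query.
    The `id` field is the index of the context in the original request.
    """
    ranked = sorted(result["response"], key=lambda item: item["score"], reverse=True)
    return [(contexts[item["id"]], item["score"]) for item in ranked]


if __name__ == "__main__":
    contexts = ["An orange cloud", "A llama"]
    # Made-up scores, for illustration only.
    sample = {"response": [{"id": 0, "score": 0.1}, {"id": 1, "score": 0.9}]}
    print(classify_output(sample))
    print(rank_contexts(sample, contexts))
```

Keeping the original `contexts` list around is what makes the `id` values usable, since the response carries only indices and not the context text.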
API Schemas
The following schemas are based on JSON Schema
Input

    {
      "type": "object",
      "oneOf": [
        {
          "title": "BGE M3 Input Query and Contexts",
          "properties": {
            "query": {
              "type": "string",
              "minLength": 1,
              "description": "A query you wish to perform against the provided contexts. If no query is provided the model with respond with embeddings for contexts"
            },
            "contexts": {
              "type": "array",
              "items": {
                "type": "object",
                "properties": {
                  "text": {
                    "type": "string",
                    "minLength": 1,
                    "description": "One of the provided context content"
                  }
                }
              },
              "description": "List of provided contexts. Note that the index in this array is important, as the response will refer to it."
            },
            "truncate_inputs": {
              "type": "boolean",
              "default": false,
              "description": "When provided with too long context should the model error out or truncate the context to fit?"
            }
          },
          "required": ["contexts"]
        },
        {
          "title": "BGE M3 Input Embedding",
          "properties": {
            "text": {
              "oneOf": [
                {
                  "type": "string",
                  "description": "The text to embed",
                  "minLength": 1
                },
                {
                  "type": "array",
                  "description": "Batch of text values to embed",
                  "items": {
                    "type": "string",
                    "description": "The text to embed",
                    "minLength": 1
                  },
                  "maxItems": 100
                }
              ]
            },
            "truncate_inputs": {
              "type": "boolean",
              "default": false,
              "description": "When provided with too long context should the model error out or truncate the context to fit?"
            }
          },
          "required": ["text"]
        },
        {
          "properties": {
            "requests": {
              "type": "array",
              "description": "Batch of the embeddings requests to run using async-queue",
              "items": {
                "type": "object",
                "oneOf": [
                  {
                    "title": "BGE M3 Input Query and Contexts",
                    "properties": {
                      "query": {
                        "type": "string",
                        "minLength": 1,
                        "description": "A query you wish to perform against the provided contexts. If no query is provided the model with respond with embeddings for contexts"
                      },
                      "contexts": {
                        "type": "array",
                        "items": {
                          "type": "object",
                          "properties": {
                            "text": {
                              "type": "string",
                              "minLength": 1,
                              "description": "One of the provided context content"
                            }
                          }
                        },
                        "description": "List of provided contexts. Note that the index in this array is important, as the response will refer to it."
                      },
                      "truncate_inputs": {
                        "type": "boolean",
                        "default": false,
                        "description": "When provided with too long context should the model error out or truncate the context to fit?"
                      }
                    },
                    "required": ["contexts"]
                  },
                  {
                    "title": "BGE M3 Input Embedding",
                    "properties": {
                      "text": {
                        "oneOf": [
                          {
                            "type": "string",
                            "description": "The text to embed",
                            "minLength": 1
                          },
                          {
                            "type": "array",
                            "description": "Batch of text values to embed",
                            "items": {
                              "type": "string",
                              "description": "The text to embed",
                              "minLength": 1
                            },
                            "maxItems": 100
                          }
                        ]
                      },
                      "truncate_inputs": {
                        "type": "boolean",
                        "default": false,
                        "description": "When provided with too long context should the model error out or truncate the context to fit?"
                      }
                    },
                    "required": ["text"]
                  }
                ]
              }
            }
          },
          "required": ["requests"]
        }
      ]
    }

Output

    {
      "type": "object",
      "contentType": "application/json",
      "oneOf": [
        {
          "title": "BGE M3 Ouput Query",
          "properties": {
            "response": {
              "type": "array",
              "items": {
                "type": "object",
                "properties": {
                  "id": {
                    "type": "integer",
                    "description": "Index of the context in the request"
                  },
                  "score": {
                    "type": "number",
                    "description": "Score of the context under the index."
                  }
                }
              }
            }
          }
        },
        {
          "title": "BGE M3 Output Embedding for Contexts",
          "properties": {
            "response": {
              "type": "array",
              "items": {
                "type": "array",
                "items": {
                  "type": "number"
                }
              }
            },
            "shape": {
              "type": "array",
              "items": {
                "type": "number"
              }
            },
            "pooling": {
              "type": "string",
              "enum": ["mean", "cls"],
              "description": "The pooling method used in the embedding process."
            }
          }
        },
        {
          "title": "BGE M3 Ouput Embedding",
          "properties": {
            "shape": {
              "type": "array",
              "items": {
                "type": "number"
              }
            },
            "data": {
              "type": "array",
              "description": "Embeddings of the requested text values",
              "items": {
                "type": "array",
                "description": "Floating point embedding representation shaped by the embedding model",
                "items": {
                  "type": "number"
                }
              }
            },
            "pooling": {
              "type": "string",
              "enum": ["mean", "cls"],
              "description": "The pooling method used in the embedding process."
            }
          }
        },
        {
          "type": "object",
          "contentType": "application/json",
          "title": "Async response",
          "properties": {
            "request_id": {
              "type": "string",
              "description": "The async request id that can be used to obtain the results."
            }
          }
        }
      ]
    }
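
The input schema constrains the `text` field: strings need at least one character (`minLength` 1) and an array may hold at most 100 items (`maxItems` 100). As a rough illustration, these limits could be checked on the client before a request is sent. The sketch below is a hand-written check, not a full JSON Schema validator, and `check_text_input` and `chunk_texts` are hypothetical helper names.

```python
from typing import Any, List

MAX_BATCH_ITEMS = 100  # "maxItems" on the text array in the input schema


def check_text_input(text: Any) -> List[str]:
    """Return a list of problems; an empty list means the value passes these checks."""
    problems: List[str] = []
    if isinstance(text, str):
        if len(text) < 1:
            problems.append("text must be a non-empty string")
    elif isinstance(text, list):
        if len(text) > MAX_BATCH_ITEMS:
            problems.append(f"text must contain at most {MAX_BATCH_ITEMS} items")
        for i, item in enumerate(text):
            if not isinstance(item, str) or len(item) < 1:
                problems.append(f"text[{i}] must be a non-empty string")
    else:
        problems.append("text must be a string or an array of strings")
    return problems


def chunk_texts(texts: List[str], size: int = MAX_BATCH_ITEMS) -> List[List[str]]:
    """Split a long list into consecutive chunks of at most `size` items."""
    return [texts[i:i + size] for i in range(0, len(texts), size)]


if __name__ == "__main__":
    texts = ["story"] * 250
    print(check_text_input(texts))
    print([len(chunk) for chunk in chunk_texts(texts)])
```

Splitting an oversized list into chunks keeps each request within the array limit; how the chunks are then sent (one request each, or grouped under `requests`) is left to the caller.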