Commit ee70db2

Merge pull request #368 from mjmenger/f5-app-stack-deploy
F5 app stack deploy
2 parents c085da8 + 82de21a commit ee70db2

3 files changed

Lines changed: 158 additions & 0 deletions

README.md

Lines changed: 1 addition & 0 deletions
@@ -320,6 +320,7 @@ Here's a guide to [use the config object in your request](https://portkey.ai/doc

## Deploying the AI Gateway
[See docs](docs/installation-deployments.md) on installing the AI Gateway locally or deploying it on popular locations.
- Deploy to [App Stack](docs/installation-deployments.md#deploy-to-app-stack)
- Deploy to [Cloudflare Workers](https://github.com/Portkey-AI/gateway/blob/main/docs/installation-deployments.md#deploy-to-cloudflare-workers)
- Deploy using [Docker](https://github.com/Portkey-AI/gateway/blob/main/docs/installation-deployments.md#deploy-using-docker)
- Deploy using [Docker Compose](https://github.com/Portkey-AI/gateway/blob/main/docs/installation-deployments.md#deploy-using-docker-compose)

deployment.yaml

Lines changed: 58 additions & 0 deletions
@@ -0,0 +1,58 @@
apiVersion: v1
kind: Namespace
metadata:
  name: portkeyai
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: portkeyai
  namespace: portkeyai
spec:
  replicas: 1
  revisionHistoryLimit: 3
  selector:
    matchLabels:
      app: portkeyai
      version: v1
  strategy:
    rollingUpdate:
      maxSurge: 25%
      maxUnavailable: 25%
    type: RollingUpdate
  template:
    metadata:
      labels:
        app: portkeyai
        version: v1
    spec:
      containers:
      - image: portkeyai/gateway
        imagePullPolicy: IfNotPresent
        name: portkeyai
        ports:
        - containerPort: 8787
          protocol: TCP
        resources: {}
      dnsPolicy: ClusterFirst
      restartPolicy: Always
      schedulerName: default-scheduler
      securityContext: {}
---
apiVersion: v1
kind: Service
metadata:
  name: portkeyai
  namespace: portkeyai
spec:
  ports:
  - port: 8787
    protocol: TCP
    targetPort: 8787
  selector:
    app: portkeyai
    version: v1
  sessionAffinity: None
  type: NodePort

docs/installation-deployments.md

Lines changed: 99 additions & 0 deletions
@@ -38,6 +38,105 @@ $ bunx @portkey-ai/gateway

<br>

# Deploy to App Stack
The following steps deploy the AI Gateway to an F5 Distributed Cloud App Stack site.
1. [Create an App Stack Site](https://docs.cloud.f5.com/docs/how-to/site-management/create-voltstack-site)

2. Retrieve the global kubeconfig
```shell
export DISTRIBUTED_CLOUD_TENANT=mytenantname
# find tenant id in the F5 Distributed Cloud GUI at
# Account -> Account Settings -> Tenant Overview -> Tenant ID
export DISTRIBUTED_CLOUD_TENANT_ID=mytenantnamewithextensionfoundintheconsole
# create an API token in the F5 Distributed Cloud GUI at
# Account -> Account Settings -> Credentials -> Add Credentials
# set Credential Type to API Token, not API Certificate
export DISTRIBUTED_CLOUD_API_TOKEN=myapitoken
export DISTRIBUTED_CLOUD_SITE_NAME=appstacksitename
export DISTRIBUTED_CLOUD_NAMESPACE=mydistributedcloudnamespace
export DISTRIBUTED_CLOUD_APP_STACK_NAMESPACE=portkeyai
export DISTRIBUTED_CLOUD_APP_STACK_SITE=myappstacksite
export DISTRIBUTED_CLOUD_SERVICE_NAME=portkeyai
# adjust the expiry date to a time no more than 90 days in the future
export KUBECONFIG_CERT_EXPIRE_DATE="2021-09-14T09:02:25.547659194Z"
export PORTKEY_GATEWAY_FQDN=the.host.nameof.theservice
export PORTKEY_PROVIDER=openai
export PORTKEY_PROVIDER_AUTH_TOKEN=authorizationtoken

# double quotes are required so the shell expands the variables;
# inside single quotes they would be sent literally
curl --location --request POST "https://$DISTRIBUTED_CLOUD_TENANT.console.ves.volterra.io/api/web/namespaces/system/sites/$DISTRIBUTED_CLOUD_SITE_NAME/global-kubeconfigs" \
  --header "Authorization: APIToken $DISTRIBUTED_CLOUD_API_TOKEN" \
  --header 'Access-Control-Allow-Origin: *' \
  --header "x-volterra-apigw-tenant: $DISTRIBUTED_CLOUD_TENANT" \
  --data-raw "{\"expirationTimestamp\":\"$KUBECONFIG_CERT_EXPIRE_DATE\"}"
```
Save the response in a YAML file for later use.
See the [more detailed instructions for retrieving the App Stack kubeconfig file](https://f5cloud.zendesk.com/hc/en-us/articles/4407917988503-How-to-download-kubeconfig-via-API-or-vesctl).

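The comment in step 2 asks for an expiry date no more than 90 days in the future, and the sample value is a fixed date in 2021. As a rough sketch, the timestamp can be computed rather than edited by hand. `kubeconfig_expiry` is a hypothetical helper name, the `-d` flag assumes GNU `date` (BSD/macOS `date` uses `-v` instead), and it is an assumption that the API accepts a timestamp without fractional seconds.

```shell
# Hypothetical helper: print a UTC timestamp N days from now (default 89).
# Assumes GNU date; the -d flag is not available in BSD/macOS date.
kubeconfig_expiry() {
  days="${1:-89}"
  date -u -d "+${days} days" +"%Y-%m-%dT%H:%M:%SZ"
}

# Stay one day under the 90-day limit mentioned in step 2.
export KUBECONFIG_CERT_EXPIRE_DATE="$(kubeconfig_expiry 89)"
echo "$KUBECONFIG_CERT_EXPIRE_DATE"
```

Run this before the `curl` call in step 2 so the request carries a date that is still in the future.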
3. Download the deployment YAML
```shell
wget https://raw.githubusercontent.com/Portkey-AI/gateway/main/deployment.yaml
```

4. Apply the manifest
```shell
export KUBECONFIG=path/to/downloaded/global/kubeconfig/in/step/two
# apply the file downloaded in step 3
kubectl apply -f deployment.yaml
```
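The API calls in the remaining steps interpolate several of the variables exported in step 2, and an unset variable silently produces a malformed URL or payload. A minimal pre-flight check might look like the sketch below. `require_vars` is a hypothetical helper, not part of any F5 or Portkey tooling, and it assumes a POSIX-compatible shell.

```shell
# Hypothetical helper: report every named variable that is unset or empty.
# Returns 0 when all are set, 1 otherwise.
require_vars() {
  missing=0
  for name in "$@"; do
    # Indirect lookup of the variable called $name (POSIX-compatible).
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      echo "missing: $name" >&2
      missing=1
    fi
  done
  return "$missing"
}

require_vars DISTRIBUTED_CLOUD_TENANT DISTRIBUTED_CLOUD_TENANT_ID \
  DISTRIBUTED_CLOUD_API_TOKEN DISTRIBUTED_CLOUD_NAMESPACE \
  DISTRIBUTED_CLOUD_APP_STACK_NAMESPACE DISTRIBUTED_CLOUD_APP_STACK_SITE \
  DISTRIBUTED_CLOUD_SERVICE_NAME PORTKEY_GATEWAY_FQDN \
  PORTKEY_PROVIDER PORTKEY_PROVIDER_AUTH_TOKEN \
  || echo "export the variables from step 2 before continuing" >&2
```

The check only confirms that each variable is non-empty; it does not validate the values against the tenant.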
5. Create Origin Pool
```shell
# create origin pool
# the unquoted heredoc (<<EOF) lets the shell expand the variables inside the JSON payload
curl --request POST \
  --url "https://$DISTRIBUTED_CLOUD_TENANT.console.ves.volterra.io/api/config/namespaces/$DISTRIBUTED_CLOUD_NAMESPACE/origin_pools" \
  --header "authorization: APIToken $DISTRIBUTED_CLOUD_API_TOKEN" \
  --header 'content-type: application/json' \
  --data @- <<EOF
{"metadata": {"name": "$DISTRIBUTED_CLOUD_SERVICE_NAME","namespace": "$DISTRIBUTED_CLOUD_NAMESPACE","labels": {},"annotations": {},"description": "","disable": false},"spec": {"origin_servers": [{"k8s_service": {"service_name": "$DISTRIBUTED_CLOUD_SERVICE_NAME.$DISTRIBUTED_CLOUD_APP_STACK_NAMESPACE","site_locator": {"site": {"tenant": "$DISTRIBUTED_CLOUD_TENANT_ID","namespace": "system","name": "$DISTRIBUTED_CLOUD_APP_STACK_SITE"}},"inside_network": {}},"labels": {}}],"no_tls": {},"port": 8787,"same_as_endpoint_port": {},"healthcheck": [],"loadbalancer_algorithm": "LB_OVERRIDE","endpoint_selection": "LOCAL_PREFERRED","advanced_options": null}}
EOF
```
or [use the UI](https://docs.cloud.f5.com/docs/how-to/app-networking/origin-pools)
6. Create an HTTP Load Balancer, including header injection of Portkey provider and credentials
```shell
# the unquoted heredoc (<<EOF) lets the shell expand the variables inside the JSON payload
curl --request POST \
  --url "https://$DISTRIBUTED_CLOUD_TENANT.console.ves.volterra.io/api/config/namespaces/$DISTRIBUTED_CLOUD_NAMESPACE/http_loadbalancers" \
  --header "authorization: APIToken $DISTRIBUTED_CLOUD_API_TOKEN" \
  --header 'content-type: application/json' \
  --data @- <<EOF
{"metadata": {"name": "$DISTRIBUTED_CLOUD_SERVICE_NAME","namespace": "$DISTRIBUTED_CLOUD_NAMESPACE","labels": {},"annotations": {},"description": "","disable": false},"spec": {"domains": ["$PORTKEY_GATEWAY_FQDN"],"https_auto_cert": {"http_redirect": true,"add_hsts": false,"tls_config": {"default_security": {}},"no_mtls": {},"default_header": {},"enable_path_normalize": {},"port": 443,"non_default_loadbalancer": {},"header_transformation_type": {"default_header_transformation": {}},"connection_idle_timeout": 120000,"http_protocol_options": {"http_protocol_enable_v1_v2": {}}},"advertise_on_public_default_vip": {},"default_route_pools": [{"pool": {"tenant": "$DISTRIBUTED_CLOUD_TENANT_ID","namespace": "$DISTRIBUTED_CLOUD_NAMESPACE","name": "$DISTRIBUTED_CLOUD_SERVICE_NAME"},"weight": 1,"priority": 1,"endpoint_subsets": {}}],"origin_server_subset_rule_list": null,"routes": [],"cors_policy": null,"disable_waf": {},"add_location": true,"no_challenge": {},"more_option": {"request_headers_to_add": [{"name": "x-portkey-provider","value": "$PORTKEY_PROVIDER","append": false},{"name": "Authorization","value": "Bearer $PORTKEY_PROVIDER_AUTH_TOKEN","append": false}],"request_headers_to_remove": [],"response_headers_to_add": [],"response_headers_to_remove": [],"max_request_header_size": 60,"buffer_policy": null,"compression_params": null,"custom_errors": {},"javascript_info": null,"jwt": [],"idle_timeout": 30000,"disable_default_error_pages": false,"cookies_to_modify": []},"user_id_client_ip": {},"disable_rate_limit": {},"malicious_user_mitigation": null,"waf_exclusion_rules": [],"data_guard_rules": [],"blocked_clients": [],"trusted_clients": [],"api_protection_rules": null,"ddos_mitigation_rules": [],"service_policies_from_namespace": {},"round_robin": {},"disable_trust_client_ip_headers": {},"disable_ddos_detection": {},"disable_malicious_user_detection": {},"disable_api_discovery": {},"disable_bot_defense": {},"disable_api_definition": {},"disable_ip_reputation": {},"disable_client_side_defense": {},"csrf_policy": null,"graphql_rules": [],"protected_cookies": [],"host_name": "","dns_info": [],"internet_vip_info": [],"system_default_timeouts": {},"jwt_validation": null,"disable_threat_intelligence": {},"l7_ddos_action_default": {}}}
EOF
```
or [use the UI](https://docs.cloud.f5.com/docs/how-to/app-networking/http-load-balancer)
7. Test the service
```shell
curl --request POST \
  --url "https://$PORTKEY_GATEWAY_FQDN/v1/chat/completions" \
  --header 'content-type: application/json' \
  --data '{"messages": [{"role": "user","content": "Say this might be a test."}],"max_tokens": 20,"model": "gpt-4"}'
```
In addition to the response headers, you should get a response body like:
```json
{
  "id": "chatcmpl-abcde......09876",
  "object": "chat.completion",
  "created": "0123456789",
  "model": "gpt-4-0321",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "This might be a test."
      },
      "logprobs": null,
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 14,
    "completion_tokens": 6,
    "total_tokens": 20
  },
  "system_fingerprint": null
}
```

### Cloudflare Workers

1. Clone the Repository
