|
341 | 341 | "metadata": {}, |
342 | 342 | "outputs": [], |
343 | 343 | "source": [ |
344 | | - "import json\n", |
345 | | - "\n", |
346 | | - "\n", |
347 | 344 | "input_payload = json.dumps({\n", |
348 | 345 | " 'data': [\n", |
349 | 346 | " [ 0.03807591, 0.05068012, 0.06169621, 0.02187235, -0.0442235,\n", |
|
376 | 373 | "cell_type": "markdown", |
377 | 374 | "metadata": {}, |
378 | 375 | "source": [ |
379 | | - "### Model profiling\n", |
| 376 | + "### Model Profiling\n", |
380 | 377 | "\n", |
381 | | - "You can also take advantage of the profiling feature to estimate CPU and memory requirements for models.\n", |
| 378 | + "Profile your model to understand how much CPU and memory the service created by deploying it will need. Profiling returns information such as CPU usage, memory usage, and response latency. It also provides a CPU and memory recommendation based on the resource usage. You can profile your model (or, more precisely, the service built from your model) on any CPU and/or memory combination where 0.1 <= CPU <= 3.5 and 0.1GB <= memory <= 15GB. If you do not provide a CPU and/or memory requirement, we will test it on the default configuration of 3.5 CPU and 15GB memory.\n", |
382 | 379 | "\n", |
383 | | - "```python\n", |
384 | | - "profile = Model.profile(ws, \"profilename\", [model], inference_config, test_sample)\n", |
385 | | - "profile.wait_for_profiling(True)\n", |
386 | | - "profiling_results = profile.get_results()\n", |
387 | | - "print(profiling_results)\n", |
388 | | - "```" |
| 380 | + "In order to profile your model you will need:\n", |
| 381 | + "- a registered model\n", |
| 382 | + "- an entry script\n", |
| 383 | + "- an inference configuration\n", |
| 384 | + "- a single-column tabular dataset, where each row contains a string representing sample request data sent to the service.\n", |
| 385 | + "\n", |
| 386 | + "At this point we only support profiling of services that expect their request data to be a string, for example: string-serialized JSON, text, a string-serialized image, etc. The content of each row of the dataset (a string) will be put into the body of the HTTP request and sent to the service encapsulating the model for scoring.\n", |
| 387 | + "\n", |
| 388 | + "Below is an example of how you can construct an input dataset to profile a service which expects its incoming requests to contain serialized JSON. In this case we created a dataset based on one hundred instances of the same request data. In real-world scenarios, however, we suggest that you use larger datasets with various inputs, especially if your model's resource usage or behavior is input dependent." |
| 389 | + ] |
| 390 | + }, |
| 391 | + { |
| 392 | + "cell_type": "code", |
| 393 | + "execution_count": null, |
| 394 | + "metadata": {}, |
| 395 | + "outputs": [], |
| 396 | + "source": [ |
| 397 | + "from azureml.core import Datastore\n", |
| 398 | + "from azureml.core.dataset import Dataset\n", |
| 399 | + "from azureml.data import dataset_type_definitions\n", |
| 400 | + "\n", |
| 401 | + "\n", |
| 402 | + "# create a string that can be utf-8 encoded and\n", |
| 403 | + "# put in the body of the request\n", |
| 404 | + "serialized_input_json = json.dumps({\n", |
| 405 | + " 'data': [\n", |
| 406 | + " [ 0.03807591, 0.05068012, 0.06169621, 0.02187235, -0.0442235,\n", |
| 407 | + " -0.03482076, -0.04340085, -0.00259226, 0.01990842, -0.01764613]\n", |
| 408 | + " ]\n", |
| 409 | + "})\n", |
| 410 | + "dataset_content = []\n", |
| 411 | + "for i in range(100):\n", |
| 412 | + " dataset_content.append(serialized_input_json)\n", |
| 413 | + "dataset_content = '\\n'.join(dataset_content)\n", |
| 414 | + "file_name = 'sample_request_data.txt'\n", |
| 415 | + "f = open(file_name, 'w')\n", |
| 416 | + "f.write(dataset_content)\n", |
| 417 | + "f.close()\n", |
| 418 | + "\n", |
| 419 | + "# upload the txt file created above to the Datastore and create a dataset from it\n", |
| 420 | + "data_store = Datastore.get_default(ws)\n", |
| 421 | + "data_store.upload_files(['./' + file_name], target_path='sample_request_data')\n", |
| 422 | + "datastore_path = [(data_store, 'sample_request_data' +'/' + file_name)]\n", |
| 423 | + "sample_request_data = Dataset.Tabular.from_delimited_files(\n", |
| 424 | + " datastore_path,\n", |
| 425 | + " separator='\\n',\n", |
| 426 | + " infer_column_types=True,\n", |
| 427 | + " header=dataset_type_definitions.PromoteHeadersBehavior.NO_HEADERS)\n", |
| 428 | + "sample_request_data = sample_request_data.register(workspace=ws,\n", |
| 429 | + " name='diabetes_sample_request_data',\n", |
| 430 | + " create_new_version=True)" |
| 431 | + ] |
| 432 | + }, |
| 433 | + { |
| 434 | + "cell_type": "markdown", |
| 435 | + "metadata": {}, |
| 436 | + "source": [ |
| 437 | + "Now that we have an input dataset we are ready to go ahead with profiling. In this case we are testing the previously introduced sklearn regression model on 1 CPU and 0.5 GB memory. The memory usage and recommendation presented in the result are measured in gigabytes. The CPU usage and recommendation are measured in CPU cores." |
| 438 | + ] |
| 439 | + }, |
| 440 | + { |
| 441 | + "cell_type": "code", |
| 442 | + "execution_count": null, |
| 443 | + "metadata": {}, |
| 444 | + "outputs": [], |
| 445 | + "source": [ |
| 446 | + "from datetime import datetime\n", |
| 447 | + "\n", |
| 448 | + "\n", |
| 449 | + "environment = Environment('my-sklearn-environment')\n", |
| 450 | + "environment.python.conda_dependencies = CondaDependencies.create(pip_packages=[\n", |
| 451 | + " 'azureml-defaults',\n", |
| 452 | + " 'inference-schema[numpy-support]',\n", |
| 453 | + " 'joblib',\n", |
| 454 | + " 'numpy',\n", |
| 455 | + " 'scikit-learn'\n", |
| 456 | + "])\n", |
| 457 | + "inference_config = InferenceConfig(entry_script='score.py', environment=environment)\n", |
| 458 | + "# if cpu and memory_in_gb parameters are not provided\n", |
| 459 | + "# the model will be profiled on default configuration of\n", |
| 460 | + "# 3.5CPU and 15GB memory\n", |
| 461 | + "profile = Model.profile(ws,\n", |
| 462 | + " 'rgrsn-%s' % datetime.now().strftime('%m%d%Y-%H%M%S'),\n", |
| 463 | + " [model],\n", |
| 464 | + " inference_config,\n", |
| 465 | + " input_dataset=sample_request_data,\n", |
| 466 | + " cpu=1.0,\n", |
| 467 | + " memory_in_gb=0.5)\n", |
| 468 | + "\n", |
| 469 | + "profile.wait_for_completion(True)\n", |
| 470 | + "details = profile.get_details()" |
389 | 471 | ] |
390 | 472 | }, |
391 | 473 | { |
|