Commit 5cb4651

Merge pull request Azure#1556 from Azure/update-spark-notebook
updating spark notebook
2 parents d835b18 + 0ce37dd commit 5cb4651

File tree

1 file changed: +113 −105 lines changed

how-to-use-azureml/deployment/spark/model-register-and-deploy-spark.ipynb

Lines changed: 113 additions & 105 deletions
@@ -2,144 +2,151 @@
  "cells": [
   {
    "cell_type": "markdown",
-   "metadata": {},
    "source": [
     "Copyright (c) Microsoft Corporation. All rights reserved.\n",
     "\n",
     "Licensed under the MIT License."
-   ]
+   ],
+   "metadata": {}
   },
   {
    "cell_type": "markdown",
-   "metadata": {},
    "source": [
     "![Impressions](https://PixelServer20190423114238.azurewebsites.net/api/impressions/MachineLearningNotebooks/how-to-use-azureml/deployment/deploy-to-cloud/model-register-and-deploy.png)"
-   ]
+   ],
+   "metadata": {}
   },
   {
    "cell_type": "markdown",
-   "metadata": {},
    "source": [
     "# Register Spark Model and deploy as Webservice\n",
     "\n",
     "This example shows how to deploy a Webservice in step-by-step fashion:\n",
     "\n",
     " 1. Register Spark Model\n",
     " 2. Deploy Spark Model as Webservice"
-   ]
+   ],
+   "metadata": {}
   },
   {
    "cell_type": "markdown",
-   "metadata": {},
    "source": [
     "## Prerequisites\n",
     "If you are using an Azure Machine Learning Notebook VM, you are all set. Otherwise, make sure you go through the [configuration](../../../configuration.ipynb) Notebook first if you haven't."
-   ]
+   ],
+   "metadata": {}
   },
   {
    "cell_type": "code",
    "execution_count": null,
-   "metadata": {},
-   "outputs": [],
    "source": [
-    "# Check core SDK version number\n",
-    "import azureml.core\n",
-    "\n",
+    "# Check core SDK version number\r\n",
+    "import azureml.core\r\n",
+    "\r\n",
     "print(\"SDK version:\", azureml.core.VERSION)"
-   ]
+   ],
+   "outputs": [],
+   "metadata": {}
   },
   {
    "cell_type": "markdown",
-   "metadata": {},
    "source": [
     "## Initialize Workspace\n",
     "\n",
     "Initialize a workspace object from persisted configuration."
-   ]
+   ],
+   "metadata": {}
   },
   {
    "cell_type": "code",
    "execution_count": null,
+   "source": [
+    "from azureml.core import Workspace\r\n",
+    "\r\n",
+    "ws = Workspace.from_config()\r\n",
+    "print(ws.name, ws.resource_group, ws.location, ws.subscription_id, sep='\\n')"
+   ],
+   "outputs": [],
    "metadata": {
     "tags": [
      "create workspace"
     ]
-   },
-   "outputs": [],
-   "source": [
-    "from azureml.core import Workspace\n",
-    "\n",
-    "ws = Workspace.from_config()\n",
-    "print(ws.name, ws.resource_group, ws.location, ws.subscription_id, sep='\\n')"
-   ]
+   }
   },
   {
    "cell_type": "markdown",
-   "metadata": {},
    "source": [
     "### Register Model"
-   ]
+   ],
+   "metadata": {}
   },
   {
    "cell_type": "markdown",
-   "metadata": {},
    "source": [
     "You can add tags and descriptions to your Models. Note you need to have an `iris.model` file in the current directory. This model file is generated using the [train in spark](../training/train-in-spark/train-in-spark.ipynb) notebook. The call below registers that file as a Model with the same name, `iris.model`, in the workspace.\n",
     "\n",
     "Using tags, you can track useful information such as the name and version of the machine learning library used to train the model. Note that tags must be alphanumeric."
-   ]
+   ],
+   "metadata": {}
   },
   {
    "cell_type": "code",
    "execution_count": null,
+   "source": [
+    "from azureml.core.model import Model\r\n",
+    "\r\n",
+    "model = Model.register(model_path=\"iris.model\",\r\n",
+    "                       model_name=\"iris.model\",\r\n",
+    "                       tags={'type': \"regression\"},\r\n",
+    "                       description=\"Logistic regression model to predict iris species\",\r\n",
+    "                       workspace=ws)"
+   ],
+   "outputs": [],
    "metadata": {
     "tags": [
      "register model from file"
     ]
-   },
-   "outputs": [],
-   "source": [
-    "from azureml.core.model import Model\n",
-    "\n",
-    "model = Model.register(model_path=\"iris.model\",\n",
-    "                       model_name=\"iris.model\",\n",
-    "                       tags={'type': \"regression\"},\n",
-    "                       description=\"Logistic regression model to predict iris species\",\n",
-    "                       workspace=ws)"
-   ]
+   }
   },
   {
    "cell_type": "markdown",
-   "metadata": {},
    "source": [
     "### Fetch Environment"
-   ]
+   ],
+   "metadata": {}
   },
   {
    "cell_type": "markdown",
-   "metadata": {},
    "source": [
     "You can now create and/or use an Environment object when deploying a Webservice. The Environment may have been previously registered with your Workspace, or it will be registered as part of the Webservice deployment.\n",
     "\n",
     "In this notebook, we will be using 'AzureML-PySpark-MmlSpark-0.15', a curated environment.\n",
     "\n",
     "More information can be found in our [using environments notebook](../training/using-environments/using-environments.ipynb)."
-   ]
+   ],
+   "metadata": {}
   },
   {
    "cell_type": "code",
    "execution_count": null,
-   "metadata": {},
-   "outputs": [],
    "source": [
-    "from azureml.core import Environment\n",
-    "\n",
-    "env = Environment.get(ws, name='AzureML-PySpark-MmlSpark-0.15')\n"
-   ]
+    "from azureml.core import Environment\r\n",
+    "from azureml.core.environment import SparkPackage\r\n",
+    "from azureml.core.conda_dependencies import CondaDependencies\r\n",
+    "\r\n",
+    "myenv = Environment('my-pyspark-environment')\r\n",
+    "myenv.docker.base_image = \"mcr.microsoft.com/mmlspark/release:0.15\"\r\n",
+    "myenv.inferencing_stack_version = \"latest\"\r\n",
+    "myenv.python.conda_dependencies = CondaDependencies.create(pip_packages=[\"azureml-core\",\"azureml-defaults\",\"azureml-telemetry\",\"azureml-train-restclients-hyperdrive\",\"azureml-train-core\"], python_version=\"3.6.2\")\r\n",
+    "myenv.python.conda_dependencies.add_channel(\"conda-forge\")\r\n",
+    "myenv.spark.packages = [SparkPackage(\"com.microsoft.ml.spark\", \"mmlspark_2.11\", \"0.15\"), SparkPackage(\"com.microsoft.azure\", \"azure-storage\", \"2.0.0\"), SparkPackage(\"org.apache.hadoop\", \"hadoop-azure\", \"2.7.0\")]\r\n",
+    "myenv.spark.repositories = [\"https://mmlspark.azureedge.net/maven\"]\r\n"
+   ],
+   "outputs": [],
+   "metadata": {}
   },
   {
    "cell_type": "markdown",
-   "metadata": {},
    "source": [
     "## Create Inference Configuration\n",
     "\n",
@@ -157,109 +164,109 @@
     " - source_directory = holds the source path as a string; this entire folder gets added in the image, so it's easy to access any files within this folder or subfolder\n",
     " - entry_script = contains logic specific to initializing your model and running predictions\n",
     " - environment = An environment object to use for the deployment. Doesn't have to be registered"
-   ]
+   ],
+   "metadata": {}
   },
   {
    "cell_type": "code",
    "execution_count": null,
+   "source": [
+    "from azureml.core.model import InferenceConfig\r\n",
+    "\r\n",
+    "inference_config = InferenceConfig(entry_script=\"score.py\", environment=myenv)"
+   ],
+   "outputs": [],
    "metadata": {
     "tags": [
      "create image"
     ]
-   },
-   "outputs": [],
-   "source": [
-    "from azureml.core.model import InferenceConfig\n",
-    "\n",
-    "inference_config = InferenceConfig(entry_script=\"score.py\", environment=env)"
-   ]
+   }
   },
   {
    "cell_type": "markdown",
-   "metadata": {},
    "source": [
     "### Deploy Model as Webservice on Azure Container Instance\n",
     "\n",
     "Note that the service creation can take a few minutes."
-   ]
+   ],
+   "metadata": {}
   },
   {
    "cell_type": "code",
    "execution_count": null,
+   "source": [
+    "from azureml.core.webservice import AciWebservice, Webservice\r\n",
+    "from azureml.exceptions import WebserviceException\r\n",
+    "\r\n",
+    "deployment_config = AciWebservice.deploy_configuration(cpu_cores=1, memory_gb=1)\r\n",
+    "aci_service_name = 'aciservice1'\r\n",
+    "\r\n",
+    "try:\r\n",
+    "    # if you want to get an existing service, this is the call to use\r\n",
+    "    # since the ACI name needs to be unique within a subscription, delete the existing ACI, if any\r\n",
+    "    # we then reuse aci_service_name to create the new Azure ACI\r\n",
+    "    service = Webservice(ws, name=aci_service_name)\r\n",
+    "    if service:\r\n",
+    "        service.delete()\r\n",
+    "except WebserviceException as e:\r\n",
+    "    print()\r\n",
+    "\r\n",
+    "service = Model.deploy(ws, aci_service_name, [model], inference_config, deployment_config)\r\n",
+    "\r\n",
+    "service.wait_for_deployment(True)\r\n",
+    "print(service.state)"
+   ],
+   "outputs": [],
    "metadata": {
     "tags": [
      "azuremlexception-remarks-sample"
     ]
-   },
-   "outputs": [],
-   "source": [
-    "from azureml.core.webservice import AciWebservice, Webservice\n",
-    "from azureml.exceptions import WebserviceException\n",
-    "\n",
-    "deployment_config = AciWebservice.deploy_configuration(cpu_cores=1, memory_gb=1)\n",
-    "aci_service_name = 'aciservice1'\n",
-    "\n",
-    "try:\n",
-    "    # if you want to get an existing service, this is the call to use\n",
-    "    # since the ACI name needs to be unique within a subscription, delete the existing ACI, if any\n",
-    "    # we then reuse aci_service_name to create the new Azure ACI\n",
-    "    service = Webservice(ws, name=aci_service_name)\n",
-    "    if service:\n",
-    "        service.delete()\n",
-    "except WebserviceException as e:\n",
-    "    print()\n",
-    "\n",
-    "service = Model.deploy(ws, aci_service_name, [model], inference_config, deployment_config)\n",
-    "\n",
-    "service.wait_for_deployment(True)\n",
-    "print(service.state)"
-   ]
+   }
   },
   {
    "cell_type": "markdown",
-   "metadata": {},
    "source": [
     "#### Test web service"
-   ]
+   ],
+   "metadata": {}
   },
   {
    "cell_type": "code",
    "execution_count": null,
-   "metadata": {},
-   "outputs": [],
    "source": [
-    "import json\n",
-    "test_sample = json.dumps({'features':{'type':1,'values':[4.3,3.0,1.1,0.1]},'label':2.0})\n",
-    "\n",
-    "test_sample_encoded = bytes(test_sample, encoding='utf8')\n",
-    "prediction = service.run(input_data=test_sample_encoded)\n",
+    "import json\r\n",
+    "test_sample = json.dumps({'features':{'type':1,'values':[4.3,3.0,1.1,0.1]},'label':2.0})\r\n",
+    "\r\n",
+    "test_sample_encoded = bytes(test_sample, encoding='utf8')\r\n",
+    "prediction = service.run(input_data=test_sample_encoded)\r\n",
     "print(prediction)"
-   ]
+   ],
+   "outputs": [],
+   "metadata": {}
   },
   {
    "cell_type": "markdown",
-   "metadata": {},
    "source": [
     "#### Delete ACI to clean up"
-   ]
+   ],
+   "metadata": {}
   },
   {
    "cell_type": "code",
    "execution_count": null,
+   "source": [
+    "service.delete()"
+   ],
+   "outputs": [],
    "metadata": {
     "tags": [
      "deploy service",
      "aci"
     ]
-   },
-   "outputs": [],
-   "source": [
-    "service.delete()"
-   ]
+   }
   },
   {
    "cell_type": "markdown",
-   "metadata": {},
    "source": [
     "### Model Profiling\n",
     "\n",
@@ -271,11 +278,11 @@
     "profiling_results = profile.get_results()\n",
     "print(profiling_results)\n",
     "```"
-   ]
+   ],
+   "metadata": {}
   },
   {
    "cell_type": "markdown",
-   "metadata": {},
    "source": [
     "### Model Packaging\n",
     "\n",
@@ -296,7 +303,8 @@
     "package.wait_for_creation(show_output=True)\n",
     "package.save(\"./local_context_dir\")\n",
     "```"
-   ]
+   ],
+   "metadata": {}
   }
  ],
  "metadata": {
