Skip to content

Commit bf6e3bd

Browse files
author
EC2 Default User
committed
begin .py development
1 parent 8449687 commit bf6e3bd

File tree

5 files changed

+591
-9
lines changed

5 files changed

+591
-9
lines changed
Lines changed: 15 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,15 @@
1+
# Readme.md
2+
3+
Using Amazon Sagemaker to ease training, evaluating, and deployment of MXNet models.
4+
5+
## 1. Model Research
6+
Define and train model as usual in Jupter Notebook to ensure it works.
7+
8+
## 2. Model Development for SageMaker
9+
Place model definition and training code into this [Sagemaker .py template](http://docs.aws.amazon.com/sagemaker/latest/dg/mxnet-training-inference-code-template.html)
10+
11+
## 3. Model Training and Evaluation using SageMaker
12+
13+
14+
## 4. Model Deployment using SageMaker
15+

post-sagemaker/part1_instructions.md

Lines changed: 0 additions & 9 deletions
This file was deleted.
Lines changed: 216 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,216 @@
1+
{
2+
"cells": [
3+
{
4+
"cell_type": "markdown",
5+
"metadata": {},
6+
"source": [
7+
"# Using AWS for CRISP-DM Phases 3-5: Data Prepartion, Modeling, and Evaluation"
8+
]
9+
},
10+
{
11+
"cell_type": "markdown",
12+
"metadata": {},
13+
"source": [
14+
"Now that you are inside a Jupyter Notebook, we assume that most of you are within familiar territory. As such, this tutorial will not go into detail about these phases. Rather, we'll quickly breeze through these three phases with a focus on productionalizing this code into Sagemaker. In the next steps, we'll provide more detail on how to deploy real-time models using Sagemaker's SDK.\n",
15+
"\n",
16+
"Because this tutorial is focused on Sagemaker rather than the Data Science, we'll use a common dataset, MNIST, and train an image classifier using MXNet."
17+
]
18+
},
19+
{
20+
"cell_type": "markdown",
21+
"metadata": {},
22+
"source": [
23+
"## 1. Standard CRISP-DM Phases 3-5"
24+
]
25+
},
26+
{
27+
"cell_type": "markdown",
28+
"metadata": {},
29+
"source": [
30+
"As mentioned above, this tutorial will not focus on the Data Science of the modeling. As such, the following section training and evaluation code is 90% taken from https://mxnet.incubator.apache.org/tutorials/python/mnist.html\n",
31+
"\n",
32+
"If you are familiar with MXNet and the standard training and evaluation code, feel free to jump ahead."
33+
]
34+
},
35+
{
36+
"cell_type": "markdown",
37+
"metadata": {},
38+
"source": [
39+
"### Phase 3: Data Preparation"
40+
]
41+
},
42+
{
43+
"cell_type": "markdown",
44+
"metadata": {},
45+
"source": [
46+
"#### Load Data"
47+
]
48+
},
49+
{
50+
"cell_type": "code",
51+
"execution_count": 1,
52+
"metadata": {},
53+
"outputs": [
54+
{
55+
"name": "stderr",
56+
"output_type": "stream",
57+
"text": [
58+
"/home/ec2-user/anaconda3/envs/mxnet_p36/lib/python3.6/site-packages/urllib3/contrib/pyopenssl.py:46: DeprecationWarning: OpenSSL.rand is deprecated - you should use os.urandom instead\n",
59+
" import OpenSSL.SSL\n"
60+
]
61+
}
62+
],
63+
"source": [
64+
"import mxnet as mx\n",
65+
"\n",
66+
"mnist = mx.test_utils.get_mnist()"
67+
]
68+
},
69+
{
70+
"cell_type": "markdown",
71+
"metadata": {},
72+
"source": [
73+
"#### Split training/test sets"
74+
]
75+
},
76+
{
77+
"cell_type": "code",
78+
"execution_count": 2,
79+
"metadata": {
80+
"collapsed": true
81+
},
82+
"outputs": [],
83+
"source": [
84+
"ntrain = int(mnist['train_data'].shape[0]*0.8)\n",
85+
"X_train = mnist['train_data'][:ntrain]\n",
86+
"y_train = mnist['train_label'][:ntrain]\n",
87+
"X_test = mnist['train_data'][ntrain:]\n",
88+
"y_test = mnist['train_label'][ntrain:]"
89+
]
90+
},
91+
{
92+
"cell_type": "markdown",
93+
"metadata": {},
94+
"source": [
95+
"### Phase 4: Model Training"
96+
]
97+
},
98+
{
99+
"cell_type": "markdown",
100+
"metadata": {},
101+
"source": [
102+
"#### Define Network"
103+
]
104+
},
105+
{
106+
"cell_type": "code",
107+
"execution_count": 3,
108+
"metadata": {
109+
"collapsed": true
110+
},
111+
"outputs": [],
112+
"source": [
113+
"data = mx.sym.var('data')\n",
114+
"# first conv layer\n",
115+
"conv1 = mx.sym.Convolution(data=data, kernel=(5,5), num_filter=20)\n",
116+
"tanh1 = mx.sym.Activation(data=conv1, act_type=\"tanh\")\n",
117+
"pool1 = mx.sym.Pooling(data=tanh1, pool_type=\"max\", kernel=(2,2), stride=(2,2))\n",
118+
"# second conv layer\n",
119+
"conv2 = mx.sym.Convolution(data=pool1, kernel=(5,5), num_filter=50)\n",
120+
"tanh2 = mx.sym.Activation(data=conv2, act_type=\"tanh\")\n",
121+
"pool2 = mx.sym.Pooling(data=tanh2, pool_type=\"max\", kernel=(2,2), stride=(2,2))\n",
122+
"# first fullc layer\n",
123+
"flatten = mx.sym.flatten(data=pool2)\n",
124+
"fc1 = mx.symbol.FullyConnected(data=flatten, num_hidden=500)\n",
125+
"tanh3 = mx.sym.Activation(data=fc1, act_type=\"tanh\")\n",
126+
"# second fullc\n",
127+
"fc2 = mx.sym.FullyConnected(data=tanh3, num_hidden=10)\n",
128+
"# softmax loss\n",
129+
"lenet = mx.sym.SoftmaxOutput(data=fc2, name='softmax')"
130+
]
131+
},
132+
{
133+
"cell_type": "markdown",
134+
"metadata": {},
135+
"source": [
136+
"#### Train Network"
137+
]
138+
},
139+
{
140+
"cell_type": "code",
141+
"execution_count": 5,
142+
"metadata": {},
143+
"outputs": [
144+
{
145+
"ename": "KeyboardInterrupt",
146+
"evalue": "",
147+
"output_type": "error",
148+
"traceback": [
149+
"\u001b[0;31m---------------------------------------------------------------------------\u001b[0m",
150+
"\u001b[0;31mKeyboardInterrupt\u001b[0m Traceback (most recent call last)",
151+
"\u001b[0;32m<ipython-input-5-14e629e52fa0>\u001b[0m in \u001b[0;36m<module>\u001b[0;34m()\u001b[0m\n\u001b[1;32m 16\u001b[0m \u001b[0meval_metric\u001b[0m\u001b[0;34m=\u001b[0m\u001b[0;34m'acc'\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 17\u001b[0m \u001b[0mbatch_end_callback\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0mmx\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mcallback\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mSpeedometer\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mbatch_size\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0;36m100\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m---> 18\u001b[0;31m num_epoch=10)\n\u001b[0m",
152+
"\u001b[0;32m~/anaconda3/envs/mxnet_p36/lib/python3.6/site-packages/mxnet/module/base_module.py\u001b[0m in \u001b[0;36mfit\u001b[0;34m(self, train_data, eval_data, eval_metric, epoch_end_callback, batch_end_callback, kvstore, optimizer, optimizer_params, eval_end_callback, eval_batch_end_callback, initializer, arg_params, aux_params, allow_missing, force_rebind, force_init, begin_epoch, num_epoch, validation_metric, monitor)\u001b[0m\n\u001b[1;32m 494\u001b[0m \u001b[0mend_of_batch\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0;32mTrue\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 495\u001b[0m \u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m--> 496\u001b[0;31m \u001b[0mself\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mupdate_metric\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0meval_metric\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mdata_batch\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mlabel\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m 497\u001b[0m \u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 498\u001b[0m \u001b[0;32mif\u001b[0m \u001b[0mmonitor\u001b[0m \u001b[0;32mis\u001b[0m \u001b[0;32mnot\u001b[0m \u001b[0;32mNone\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n",
153+
"\u001b[0;32m~/anaconda3/envs/mxnet_p36/lib/python3.6/site-packages/mxnet/module/module.py\u001b[0m in \u001b[0;36mupdate_metric\u001b[0;34m(self, eval_metric, labels)\u001b[0m\n\u001b[1;32m 733\u001b[0m \u001b[0mTypically\u001b[0m\u001b[0;31m \u001b[0m\u001b[0;31m`\u001b[0m\u001b[0;31m`\u001b[0m\u001b[0mdata_batch\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mlabel\u001b[0m\u001b[0;31m`\u001b[0m\u001b[0;31m`\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 734\u001b[0m \"\"\"\n\u001b[0;32m--> 735\u001b[0;31m \u001b[0mself\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0m_exec_group\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mupdate_metric\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0meval_metric\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mlabels\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m 736\u001b[0m \u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 737\u001b[0m \u001b[0;32mdef\u001b[0m \u001b[0m_sync_params_from_devices\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mself\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n",
154+
"\u001b[0;32m~/anaconda3/envs/mxnet_p36/lib/python3.6/site-packages/mxnet/module/executor_group.py\u001b[0m in \u001b[0;36mupdate_metric\u001b[0;34m(self, eval_metric, labels)\u001b[0m\n\u001b[1;32m 580\u001b[0m \u001b[0mlabels_\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0mOrderedDict\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mzip\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mself\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mlabel_names\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mlabels_slice\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 581\u001b[0m \u001b[0mpreds\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0mOrderedDict\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mzip\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mself\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0moutput_names\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mtexec\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0moutputs\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m--> 582\u001b[0;31m \u001b[0meval_metric\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mupdate_dict\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mlabels_\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mpreds\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m 583\u001b[0m \u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 584\u001b[0m \u001b[0;32mdef\u001b[0m \u001b[0m_bind_ith_exec\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mself\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mi\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mdata_shapes\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mlabel_shapes\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mshared_group\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n",
155+
"\u001b[0;32m~/anaconda3/envs/mxnet_p36/lib/python3.6/site-packages/mxnet/metric.py\u001b[0m in \u001b[0;36mupdate_dict\u001b[0;34m(self, label, pred)\u001b[0m\n\u001b[1;32m 106\u001b[0m \u001b[0mlabel\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0mlist\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mlabel\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mvalues\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 107\u001b[0m \u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m--> 108\u001b[0;31m \u001b[0mself\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mupdate\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mlabel\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mpred\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m 109\u001b[0m \u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 110\u001b[0m \u001b[0;32mdef\u001b[0m \u001b[0mupdate\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mself\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mlabels\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mpreds\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n",
156+
"\u001b[0;32m~/anaconda3/envs/mxnet_p36/lib/python3.6/site-packages/mxnet/metric.py\u001b[0m in \u001b[0;36mupdate\u001b[0;34m(self, labels, preds)\u001b[0m\n\u001b[1;32m 391\u001b[0m \u001b[0;32mif\u001b[0m \u001b[0mpred_label\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mshape\u001b[0m \u001b[0;34m!=\u001b[0m \u001b[0mlabel\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mshape\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 392\u001b[0m \u001b[0mpred_label\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0mndarray\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0margmax\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mpred_label\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0maxis\u001b[0m\u001b[0;34m=\u001b[0m\u001b[0mself\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0maxis\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m--> 393\u001b[0;31m \u001b[0mpred_label\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0mpred_label\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0masnumpy\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mastype\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m'int32'\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m 394\u001b[0m \u001b[0mlabel\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0mlabel\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0masnumpy\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mastype\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m'int32'\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 395\u001b[0m \u001b[0;34m\u001b[0m\u001b[0m\n",
157+
"\u001b[0;32m~/anaconda3/envs/mxnet_p36/lib/python3.6/site-packages/mxnet/ndarray/ndarray.py\u001b[0m in \u001b[0;36masnumpy\u001b[0;34m(self)\u001b[0m\n\u001b[1;32m 1550\u001b[0m \u001b[0mself\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mhandle\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 1551\u001b[0m \u001b[0mdata\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mctypes\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mdata_as\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mctypes\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mc_void_p\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m-> 1552\u001b[0;31m ctypes.c_size_t(data.size)))\n\u001b[0m\u001b[1;32m 1553\u001b[0m \u001b[0;32mreturn\u001b[0m \u001b[0mdata\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 1554\u001b[0m \u001b[0;34m\u001b[0m\u001b[0m\n",
158+
"\u001b[0;31mKeyboardInterrupt\u001b[0m: "
159+
]
160+
}
161+
],
162+
"source": [
163+
"# define training batch size\n",
164+
"batch_size = 100\n",
165+
"\n",
166+
"# create iterator around training and validation data\n",
167+
"train_iter = mx.io.NDArrayIter(mnist['train_data'][:ntrain], mnist['train_label'][:ntrain], batch_size, shuffle=True)\n",
168+
"val_iter = mx.io.NDArrayIter(mnist['train_data'][ntrain:], mnist['train_label'][ntrain:], batch_size)\n",
169+
"\n",
170+
"# create a trainable module\n",
171+
"# toggle this between mx.cpu() and mx.gpu() depending on if you're using ml.c-family or ml.p-family for this notebook.\n",
172+
"lenet_model = mx.mod.Module(symbol=lenet, context=mx.cpu())\n",
173+
"# train with the same\n",
174+
"lenet_model.fit(train_iter,\n",
175+
" eval_data=val_iter,\n",
176+
" optimizer='sgd',\n",
177+
" optimizer_params={'learning_rate':0.1},\n",
178+
" eval_metric='acc',\n",
179+
" batch_end_callback = mx.callback.Speedometer(batch_size, 100),\n",
180+
" num_epoch=10)"
181+
]
182+
},
183+
{
184+
"cell_type": "markdown",
185+
"metadata": {},
186+
"source": [
187+
"As you have probabily noticed by now, this step can take awhile since it's being trained locally on the instance currently used to host Jupyter.\n",
188+
"\n",
189+
"Instead, since we know that this code works, let's go to the next step and refactor the above code into the Sagemaker SDK. This allows us to use Sagemaker's distributed training capabilities to drastically speed up training time.\n",
190+
"\n",
191+
"In the instructions, please move on to Part2: Model Development"
192+
]
193+
}
194+
],
195+
"metadata": {
196+
"kernelspec": {
197+
"display_name": "conda_mxnet_p36",
198+
"language": "python",
199+
"name": "conda_mxnet_p36"
200+
},
201+
"language_info": {
202+
"codemirror_mode": {
203+
"name": "ipython",
204+
"version": 3
205+
},
206+
"file_extension": ".py",
207+
"mimetype": "text/x-python",
208+
"name": "python",
209+
"nbconvert_exporter": "python",
210+
"pygments_lexer": "ipython3",
211+
"version": "3.6.2"
212+
}
213+
},
214+
"nbformat": 4,
215+
"nbformat_minor": 2
216+
}

0 commit comments

Comments
 (0)