Setup Cloud Environment
Chris Fregly edited this page Aug 30, 2016 · 55 revisions
- We no longer support the local laptop installation because of the large memory and disk footprint of this real-world environment
- A cloud instance is included in the workshop
- You do not need to set up an instance on your own
- The instructions below are provided for people who are setting this up on their own
- While not required, I would recommend choosing Ubuntu 14.04
- Typically, we use either the Amazon Web Services `r3.2xlarge` EC2 or the Google Cloud Platform `n1-highmem-8` GCE cloud instance types
- 8 Cores
- 50+ GB RAM
- 100 GB Root Volume (MUST BE ROOT VOLUME)
- Currently, these cloud instance types cost around $8-10 per day
- Later, we show how to save money by stopping your instance, which lets you resume your work at a later date
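Once an instance is up, the sizing above can be sanity-checked from a shell on the instance. The following is a minimal sketch, assuming a Linux host with GNU coreutils; the `check` helper and the variable names are illustrative and not part of the workshop tooling.

```shell
#!/bin/bash
# Minimums taken from the sizing list above (8 cores, 50+ GB RAM).
MIN_CORES=8
MIN_RAM_GB=50

# check NAME ACTUAL REQUIRED: print a one-line verdict, return 1 on a shortfall.
# (Hypothetical helper, introduced only for this sketch.)
check() {
  if [ "$2" -ge "$3" ]; then
    echo "OK   $1: $2 (need >= $3)"
  else
    echo "LOW  $1: $2 (need >= $3)"
    return 1
  fi
}

cores=$(nproc)
# MemTotal is reported in kB; convert to whole GB.
ram_gb=$(awk '/^MemTotal:/ {print int($2 / 1024 / 1024)}' /proc/meminfo)
# Size of the filesystem mounted at /, in GB.
root_gb=$(df -P -BG / | awk 'NR==2 {gsub("G", "", $2); print $2}')

check "cores" "$cores" "$MIN_CORES"
check "RAM (GB)" "$ram_gb" "$MIN_RAM_GB"
# A 100 GB root volume reports slightly less here because of filesystem overhead,
# so the value is printed for inspection rather than compared.
echo "root filesystem (GB): $root_gb"
```

The root volume is only reported, not judged, because `df` measures the usable filesystem rather than the raw volume size.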
- Make sure all ports are open on your cloud instance
- While not secure in any way, we open all ports to make connectivity easier
- For production environments, definitely lock down these ports to the bare minimum
- Below is a screen shot of the FIREWALL RULES CHANGES required to allow all traffic into your instance
- In this example, my instances are using the "default" network which is the Google default
- You must modify these rules or you will only be able to connect to your instance on port 80 (and 443 if selected)
- Below is a screen shot of the SECURITY GROUP CHANGES required to allow all traffic into your instance
- In this example, the instance is using a security group named `fluxcapacitor`
- You must modify the security group or you will only be able to connect to your instance on port 80 (and 443 if selected)
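To confirm that the firewall or security group changes took effect, it may help to probe a few ports from your laptop. This is a rough sketch, assuming bash (for its `/dev/tcp` pseudo-device) and the coreutils `timeout` command; `port_open` and `report_ports` are hypothetical helper names, and the port list is only an example.

```shell
#!/bin/bash
# port_open HOST PORT: succeed if a TCP connection can be opened within 3 seconds.
port_open() {
  timeout 3 bash -c 'exec 3<>"/dev/tcp/$1/$2"' _ "$1" "$2" 2>/dev/null
}

# report_ports HOST PORT...: print one line per port.
report_ports() {
  local host="$1"
  shift
  local port
  for port in "$@"; do
    if port_open "$host" "$port"; then
      echo "$host:$port reachable"
    else
      echo "$host:$port NOT reachable"
    fi
  done
}

# Example call; replace the placeholder with your instance's public IP.
# report_ports <your-cloud-instance-public-ip> 22 80 443
```

A port shows as not reachable both when the firewall blocks it and when nothing is listening on it yet, so treat the output as a hint rather than proof.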
Create SSH Key Pair
Result of Associating Key Pair to a Cloud Instance at Creation Time
- NOTE: Use `-gce` or `-aws` accordingly
- Download the `.pem` file
- Copy the downloaded `.pem` file to the `/Users/<username>/.ssh` directory
mkdir -p ~/.ssh
- Change the permission on the `.pem` file so that `ssh` doesn't complain
chmod 600 ~/.ssh/pipeline-training-gce.pem
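The copy and permission steps above can be combined into one small helper. This is a sketch under assumptions: `install_key` is a hypothetical function name, and the `~/Downloads` location in the example call is a guess at where your browser saved the key.

```shell
#!/bin/bash
# install_key KEY_FILE: copy a downloaded key into ~/.ssh, restrict its
# permissions so ssh accepts it, and print the installed path.
install_key() {
  local src="$1"
  local dest="$HOME/.ssh/$(basename "$src")"
  mkdir -p "$HOME/.ssh" && chmod 700 "$HOME/.ssh"
  cp "$src" "$dest" && chmod 600 "$dest" && echo "$dest"
}

# Example call (assumed download location; use -gce or -aws accordingly):
# install_key ~/Downloads/pipeline-training-gce.pem
```

`ssh` refuses private keys that other users can read, which is why the key ends up with mode `600`.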
- Download the `.ppk` file
- Copy the downloaded file to the `\Users\<username>\.ssh` directory
REM Windows (cmd.exe): assumes the key was saved to the default Downloads folder
mkdir "%USERPROFILE%\.ssh"
copy "%USERPROFILE%\Downloads\*.ppk" "%USERPROFILE%\.ssh\"
- Username: pipeline-training
- Password: password9 if asked for a password
- Use SSH to log in to your Cloud Instance using the `.pem` file created from the previous step
- You may have to enter the password you used when you created the key pair in an earlier step
- NOTE: Use `-gce` or `-aws` accordingly
# MacOS + Linux ONLY
ssh -i ~/.ssh/pipeline-training-gce.pem pipeline-training@<your-cloud-instance-public-ip>
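If you reconnect often, an entry in `~/.ssh/config` saves retyping the key path and username. The sketch below only prints such an entry; `ssh_config_entry` is a hypothetical helper, the alias `pipeline` is arbitrary, and `203.0.113.10` is a documentation-only placeholder address.

```shell
#!/bin/bash
# ssh_config_entry ALIAS HOST KEY_FILE: print a ~/.ssh/config stanza that logs in
# as the pipeline-training user with the given key.
ssh_config_entry() {
  cat <<EOF
Host $1
    HostName $2
    User pipeline-training
    IdentityFile $3
EOF
}

# Example (placeholder IP; substitute your instance's public IP):
# ssh_config_entry pipeline 203.0.113.10 ~/.ssh/pipeline-training-gce.pem >> ~/.ssh/config
# ssh pipeline
```

Cloud instances usually get a new public IP after a stop and start, so the `HostName` line may need updating when you resume work.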
- Username: pipeline-training
- Password: password9 if asked for a password
- Download PuTTY
- Use `putty` to connect to your cloud instance using the downloaded `.ppk` file
- Do not rely on the Docker installation that comes with your Operating System
- You will need the latest Docker per the script below which should be run at instance creation time
- GCE calls these `Init Scripts` for GCE Images
- AWS calls these `User Data` for AWS AMIs
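#!/bin/bash
# Refresh the package index, then install curl (used to fetch keys and the Docker installer)
sudo apt-get update
sudo apt-get install -y curl
# Register the PCP (Performance Co-Pilot) apt repository and its signing key
curl 'https://bintray.com/user/downloadSubjectPublicKey?username=pcp' | sudo apt-key add -
echo "deb https://dl.bintray.com/pcp/trusty trusty main" | sudo tee -a /etc/apt/sources.list
sudo apt-get update
sudo apt-get install -y pcp pcp-webapi
# Install the latest Docker using the get.docker.com script
curl -fsSL https://get.docker.com/ | sh
curl -fsSL https://get.docker.com/gpg | sudo apt-key add -
# Pull the workshop image
sudo docker pull fluxcapacitor/pipeline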
- If you see the following error
Warning: failed to get default registry endpoint from daemon (Cannot
connect to the Docker daemon. Is the docker daemon running on this
host?). Using system default: https://index.docker.io/v1/
Cannot connect to the Docker daemon. Is the docker daemon running on
this host?
You are likely calling `docker <something>` instead of `sudo docker <something>`
- More info in the Docker docs
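A quick way to tell this error apart from a missing installation is to probe the CLI and the daemon separately. This is a minimal sketch; `docker_status` is a hypothetical helper, and it relies only on `docker info`, which fails when the daemon cannot be reached by the current user.

```shell
#!/bin/bash
# docker_status: print "missing" if the docker CLI is not on PATH, "no-access" if
# the daemon does not answer for the current user, and "ok" otherwise.
docker_status() {
  if ! command -v docker >/dev/null 2>&1; then
    echo "missing"
  elif ! docker info >/dev/null 2>&1; then
    echo "no-access"
  else
    echo "ok"
  fi
}

case "$(docker_status)" in
  missing)   echo "docker is not installed; run the install script first" ;;
  no-access) echo "daemon not running or not accessible; retry with: sudo docker <something>" ;;
  ok)        echo "docker is reachable without sudo" ;;
esac
```

A `no-access` result covers both a stopped daemon and a permissions problem, so try the `sudo` form first before investigating the service itself.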