|
4 | 4 | "cell_type": "markdown", |
5 | 5 | "metadata": {}, |
6 | 6 | "source": [ |
7 | | - "## Choropleth plot with geojson data" |
| 7 | + "## Working with custom layers" |
8 | 8 | ] |
9 | 9 | }, |
10 | 10 | { |
11 | 11 | "cell_type": "markdown", |
12 | 12 | "metadata": {}, |
13 | 13 | "source": [ |
14 | | - "In this activity, we not only want work with geojson data, but also see how we can create a choropleth visualization. \n", |
15 | | - "They are espacially useful to display statistical variables in shaded areas. In our case the areas will be the outlines of the states of the USA. \n", |
| 14 | + "In this activity, we will look at how to create custom layers that let you not only display geospatial data but also animate your data points over time. \n", |
| 15 | + "We'll get a deeper understanding of how geoplotlib works and how layers are created and drawn.\n", |
16 | 16 | "\n", |
17 | | - "Note:\n", |
| 17 | + "Our dataset contains not only spatial but also temporal information, which enables us to plot flights over time on our map. \n", |
| 18 | + "There is an example of how to do this with taxi data in the examples folder of geoplotlib. \n", |
| 19 | + "https://github.com/andrea-cuttone/geoplotlib/blob/master/examples/taxi.py\n", |
| 20 | + "\n", |
| 21 | + "**Note:** \n", |
18 | 22 | "The dataset can be found here: \n", |
19 | | - "https://catalog.data.gov/dataset/national-obesity-by-state-b181b" |
| 23 | + "https://datamillnorth.org/dataset/flight-tracking" |
20 | 24 | ] |
21 | 25 | }, |
22 | 26 | { |
|
30 | 34 | "cell_type": "markdown", |
31 | 35 | "metadata": {}, |
32 | 36 | "source": [ |
33 | | - "Our dataset contains the points that define the different states and an obesity value that represents the percentage of people that are obese per state. \n", |
34 | | - "Since the geojson method of geoplotlib works by providing a file path to the geojson file, we don't have to do any importing and loading of the data. \n", |
35 | | - "\n", |
36 | | - "\n", |
37 | | - "We still want to load the dataset and look at the structure of the geojson file." |
| 37 | + "This time our dataset contains flight data recorded from different machines. \n", |
| 38 | + "Each entry is assigned to a unique plane through a `hex_ident`. \n", |
| 39 | + "Each location is related to a specific timestamp that consists of a `date` and a `time`." |
38 | 40 | ] |
39 | 41 | }, |
40 | 42 | { |
41 | 43 | "cell_type": "code", |
42 | | - "execution_count": 2, |
| 44 | + "execution_count": 1, |
43 | 45 | "metadata": {}, |
44 | 46 | "outputs": [], |
45 | 47 | "source": [ |
46 | 48 | "# importing the necessary dependencies\n", |
47 | | - "import json\n", |
48 | | - "import geoplotlib\n", |
49 | | - "from geoplotlib.colors import ColorMap\n", |
50 | | - "from geoplotlib.utils import BoundingBox" |
| 49 | + "import pandas as pd" |
51 | 50 | ] |
52 | 51 | }, |
53 | 52 | { |
54 | 53 | "cell_type": "code", |
55 | | - "execution_count": 20, |
56 | | - "metadata": {}, |
57 | | - "outputs": [ |
58 | | - { |
59 | | - "name": "stdout", |
60 | | - "output_type": "stream", |
61 | | - "text": [ |
62 | | - "{\n", |
63 | | - " \"type\": \"Feature\",\n", |
64 | | - " \"properties\": {\n", |
65 | | - " \"OBJECTID\": 1,\n", |
66 | | - " \"NAME\": \"Texas\",\n", |
67 | | - " \"Obesity\": 32.4,\n", |
68 | | - " \"Shape__Area\": 7672329221282.43,\n", |
69 | | - " \"Shape__Length\": 15408321.8693326\n", |
70 | | - " },\n", |
71 | | - " \"geometry\": {\n", |
72 | | - " \"type\": \"Polygon\",\n", |
73 | | - " \"coordinates\": [\n", |
74 | | - " -106.623454789568,\n", |
75 | | - " 31.9140391520155\n", |
76 | | - " ]\n", |
77 | | - " }\n", |
78 | | - "}\n" |
79 | | - ] |
80 | | - } |
81 | | - ], |
82 | | - "source": [ |
83 | | - "# displaying one of the entries for the states\n", |
84 | | - "with open('data/National_Obesity_By_State.geojson') as data:\n", |
85 | | - " dataset = json.load(data)\n", |
86 | | - " \n", |
87 | | - " first_state = dataset.get('features')[0]\n", |
88 | | - " \n", |
89 | | - " # only showing one coordinate instead of all points\n", |
90 | | - " first_state['geometry']['coordinates'] = first_state['geometry']['coordinates'][0][0]\n", |
91 | | - " print(json.dumps(first_state, indent=4))\n", |
92 | | - " " |
93 | | - ] |
94 | | - }, |
95 | | - { |
96 | | - "cell_type": "markdown", |
97 | | - "metadata": {}, |
98 | | - "source": [ |
99 | | - "Extracting the names of all the states provided in the geojson might also be helpful later on." |
| 54 | + "execution_count": 2, |
| 55 | + "metadata": {}, |
| 56 | + "outputs": [], |
| 57 | + "source": [ |
| 58 | + "# loading the dataset from the csv file\n" |
100 | 59 | ] |
101 | 60 | }, |
102 | 61 | { |
103 | 62 | "cell_type": "code", |
104 | | - "execution_count": 7, |
| 63 | + "execution_count": 3, |
105 | 64 | "metadata": {}, |
106 | | - "outputs": [ |
107 | | - { |
108 | | - "name": "stdout", |
109 | | - "output_type": "stream", |
110 | | - "text": [ |
111 | | - "['Texas', 'California', 'Kentucky', 'Georgia', 'Wisconsin', 'Oregon', 'Virginia', 'Tennessee', 'Louisiana', 'New York', 'Michigan', 'Idaho', 'Florida', 'Alaska', 'Montana', 'Minnesota', 'Nebraska', 'Washington', 'Ohio', 'Illinois', 'Missouri', 'Iowa', 'South Dakota', 'Arkansas', 'Mississippi', 'Colorado', 'North Carolina', 'Utah', 'Oklahoma', 'Wyoming', 'West Virginia', 'Indiana', 'Massachusetts', 'Nevada', 'Connecticut', 'District of Columbia', 'Rhode Island', 'Alabama', 'Puerto Rico', 'South Carolina', 'Maine', 'Arizona', 'New Mexico', 'Maryland', 'Delaware', 'Pennsylvania', 'Kansas', 'Vermont', 'New Jersey', 'North Dakota', 'New Hampshire']\n" |
112 | | - ] |
113 | | - } |
114 | | - ], |
| 65 | + "outputs": [], |
115 | 66 | "source": [ |
116 | | - "# listing the states in the dataset\n", |
117 | | - "with open('data/National_Obesity_By_State.geojson') as data:\n", |
118 | | - " dataset = json.load(data)\n", |
119 | | - " \n", |
120 | | - " states = [feature['properties']['NAME'] for feature in dataset.get('features')]\n", |
121 | | - " print(states)" |
| 67 | + "# displaying the first 5 rows of the dataset\n" |
| 68 | + ] |
| 69 | + }, |
| 70 | + { |
| 71 | + "cell_type": "code", |
| 72 | + "execution_count": 4, |
| 73 | + "metadata": {}, |
| 74 | + "outputs": [], |
| 75 | + "source": [ |
| 76 | + "# renaming columns latitude to lat and longitude to lon\n" |
122 | 77 | ] |
123 | 78 | }, |
124 | 79 | { |
125 | 80 | "cell_type": "markdown", |
126 | 81 | "metadata": {}, |
127 | 82 | "source": [ |
128 | | - "##### **Note:** \n", |
129 | | - "The dataset has been altered, if you download it from the link mentioned in the introduction, please edit the file and remove the object describing Hawaii. \n", |
130 | | - "It lacks geometry data which leads to an error in geoplotlib." |
| 83 | + "**Note:** \n", |
| 84 | + "Remember that geoplotlib needs columns that are named `lat` and `lon`. You will encounter an error if that is not the case." |
| 85 | + ] |
| 86 | + }, |
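The loading and renaming steps left open in the cells above could be sketched as follows. This is a minimal sketch, not the activity's reference solution: the inline CSV and its values are made up for illustration, and the column names `latitude` and `longitude` are taken from the comment in the renaming cell.

```python
import io

import pandas as pd

# Hypothetical two-row sample standing in for the real flight-tracking CSV;
# the values are invented, only the column names follow the text above.
sample_csv = io.StringIO(
    "hex_ident,date,time,latitude,longitude\n"
    "ABC123,2017/09/11,17:02:06.418,53.80,-1.55\n"
    "DEF456,2017/09/11,17:02:07.100,53.75,-1.60\n"
)

# With the real data this would be pd.read_csv(...) on the downloaded csv file.
dataset = pd.read_csv(sample_csv)

# geoplotlib expects the coordinate columns to be called lat and lon.
dataset = dataset.rename(columns={"latitude": "lat", "longitude": "lon"})

print(dataset.head())
```

`rename` returns a new DataFrame, so the result is assigned back to `dataset` rather than relying on in-place modification.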
| 87 | + { |
| 88 | + "cell_type": "code", |
| 89 | + "execution_count": 5, |
| 90 | + "metadata": {}, |
| 91 | + "outputs": [], |
| 92 | + "source": [ |
| 93 | + "# displaying the first 5 rows of the dataset\n" |
131 | 94 | ] |
132 | 95 | }, |
133 | 96 | { |
|
141 | 104 | "cell_type": "markdown", |
142 | 105 | "metadata": {}, |
143 | 106 | "source": [ |
144 | | - "#### Creating a Choropleth with geojson data" |
| 107 | + "#### Adding a Unix timestamp" |
145 | 108 | ] |
146 | 109 | }, |
147 | 110 | { |
148 | 111 | "cell_type": "markdown", |
149 | 112 | "metadata": {}, |
150 | 113 | "source": [ |
151 | | - "Use the `National_Obesity_By_State.geojson` file in the data folder to visualize the different states." |
| 114 | + "The easiest way to work with and handle time is to use a Unix timestamp. \n", |
| 115 | + "In previous activities, we've already seen how to create a new column in our dataset by applying a function to it. \n", |
| 116 | + "We use the `datetime` module to parse the `date` and `time` columns of our dataset and create a Unix timestamp from them." |
152 | 117 | ] |
153 | 118 | }, |
154 | 119 | { |
155 | 120 | "cell_type": "code", |
156 | | - "execution_count": 1, |
| 121 | + "execution_count": 6, |
| 122 | + "metadata": {}, |
| 123 | + "outputs": [], |
| 124 | + "source": [ |
| 125 | + "# method to convert date and time to a Unix timestamp\n", |
| 126 | + "from datetime import datetime\n", |
| 127 | + "\n", |
| 128 | + "def to_epoch(date, time):\n", |
| 129 | + " try:\n", |
| 130 | + " timestamp = round(datetime.strptime('{} {}'.format(date, time), '%Y/%m/%d %H:%M:%S.%f').timestamp())\n", |
| 131 | + " return timestamp\n", |
| 132 | + " except ValueError:\n", |
| 133 | + " return round(datetime.strptime('2017/09/11 17:02:06.418', '%Y/%m/%d %H:%M:%S.%f').timestamp())" |
| 134 | + ] |
| 135 | + }, |
| 136 | + { |
| 137 | + "cell_type": "code", |
| 138 | + "execution_count": 6, |
| 139 | + "metadata": {}, |
| 140 | + "outputs": [], |
| 141 | + "source": [ |
| 142 | + "# creating a new column called timestamp with the to_epoch method applied\n" |
| 143 | + ] |
| 144 | + }, |
| 145 | + { |
| 146 | + "cell_type": "code", |
| 147 | + "execution_count": 7, |
157 | 148 | "metadata": {}, |
158 | 149 | "outputs": [], |
159 | 150 | "source": [ |
160 | | - "# plotting the information from the geojson file\n" |
| 151 | + "# displaying the first 5 rows of the dataset\n" |
| 152 | + ] |
| 153 | + }, |
| 154 | + { |
| 155 | + "cell_type": "markdown", |
| 156 | + "metadata": {}, |
| 157 | + "source": [ |
| 158 | + "**Note:** \n", |
| 159 | + "We round the milliseconds in our `to_epoch` method to the nearest second, since epoch time is the number of seconds (not milliseconds) that have passed since January 1st, 1970. \n", |
| 160 | + "Of course we lose some precision here, but we want to focus on creating our own custom layer instead of spending a lot of time on our dataset." |
| 161 | + ] |
| 162 | + }, |
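To make the rounding concrete, the sketch below converts the same kind of date and time strings to whole seconds. It is a variant of `to_epoch`, not the notebook's function: the name `to_epoch_utc` is hypothetical, and it pins the timezone to UTC (an assumption about the data) so that the result does not depend on the machine's local timezone, which is what `timestamp()` uses for a naive `datetime`.

```python
from datetime import datetime, timezone


def to_epoch_utc(date, time):
    """Parse 'YYYY/MM/DD' and 'HH:MM:SS.fff' as UTC and return whole seconds."""
    parsed = datetime.strptime('{} {}'.format(date, time), '%Y/%m/%d %H:%M:%S.%f')
    # Assumption: the recorded times are UTC; the original to_epoch uses local time.
    return round(parsed.replace(tzinfo=timezone.utc).timestamp())


# .418 s is rounded down, .618 s is rounded up to the next second.
print(to_epoch_utc('2017/09/11', '17:02:06.418'))  # 1505149326
print(to_epoch_utc('2017/09/11', '17:02:06.618'))  # 1505149327
```

Two readings of the same plane taken within the same second can therefore end up with identical timestamps, which is the loss of precision mentioned in the note.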
| 163 | + { |
| 164 | + "cell_type": "markdown", |
| 165 | + "metadata": {}, |
| 166 | + "source": [ |
| 167 | + "---" |
161 | 168 | ] |
162 | 169 | }, |
163 | 170 | { |
164 | 171 | "cell_type": "markdown", |
165 | 172 | "metadata": {}, |
166 | 173 | "source": [ |
167 | | - "The visualization above does not give us any kind of information about the obesity per state. It completely lacks the information we wanted to display. \n", |
168 | | - "When using choropleth plots, the shading of given areas is the important feature, in this case we'll use the obesity percentage as statistical value to determine the value of the shading.\n", |
| 174 | + "#### Writing our custom layer" |
| 175 | + ] |
| 176 | + }, |
| 177 | + { |
| 178 | + "cell_type": "markdown", |
| 179 | + "metadata": {}, |
| 180 | + "source": [ |
| 181 | + "After preparing our dataset, we can now start writing our custom layer. \n", |
| 182 | + "As mentioned at the beginning of this activity, it will be based on the taxi example of geoplotlib. \n", |
169 | 183 | "\n", |
170 | | - "Therefore we have to create a mapping function that converts the numerical value into a color.\n", |
| 184 | + "We want a layer, `TrackLayer`, that takes a `dataset` argument containing `lat` and `lon` data in combination with a `timestamp`. \n", |
| 185 | + "Given this data, we want to plot each point for each timestamp on the map, creating a tail behind the newest position of the plane.\n", |
| 186 | + "The geoplotlib colorbrewer is used to give each plane a color based on its unique `hex_ident`. \n", |
| 187 | + "The view (bounding box) of our visualization will be set to the city of Leeds, and the current timestamp is displayed as text in the upper right corner." |
| 188 | + ] |
| 189 | + }, |
| 190 | + { |
| 191 | + "cell_type": "code", |
| 192 | + "execution_count": 10, |
| 193 | + "metadata": {}, |
| 194 | + "outputs": [], |
| 195 | + "source": [ |
| 196 | + "# custom layer creation\n", |
| 197 | + "import geoplotlib\n", |
| 198 | + "from geoplotlib.layers import BaseLayer\n", |
| 199 | + "from geoplotlib.core import BatchPainter\n", |
| 200 | + "from geoplotlib.colors import colorbrewer\n", |
| 201 | + "from geoplotlib.utils import epoch_to_str, BoundingBox\n", |
171 | 202 | "\n", |
172 | | - "**Note:** \n", |
173 | | - "If you're stuck, please take a look at the example provided by the library to understand how to create a custom color mapping. \n", |
174 | | - "https://github.com/andrea-cuttone/geoplotlib/blob/master/examples/choropleth.py" |
| 203 | + "class TrackLayer(BaseLayer):\n", |
| 204 | + "\n", |
| 205 | + " # initialize class variables\n", |
| 206 | + " def __init__(self, dataset, bbox=BoundingBox.WORLD):\n", |
| 207 | + " self.view = bbox\n", |
| 208 | + " pass\n", |
| 209 | + "\n", |
| 210 | + " # implement draw routine\n", |
| 211 | + " def draw(self, proj, mouse_x, mouse_y, ui_manager):\n", |
| 212 | + " pass\n", |
| 213 | + " \n", |
| 214 | + " # bounding box that gets used when layer is created\n", |
| 215 | + " def bbox(self):\n", |
| 216 | + " return self.view" |
| 217 | + ] |
| 218 | + }, |
| 219 | + { |
| 220 | + "cell_type": "markdown", |
| 221 | + "metadata": {}, |
| 222 | + "source": [ |
| 223 | + "---" |
| 224 | + ] |
| 225 | + }, |
| 226 | + { |
| 227 | + "cell_type": "markdown", |
| 228 | + "metadata": {}, |
| 229 | + "source": [ |
| 230 | + "#### Visualization of the custom layer" |
| 231 | + ] |
| 232 | + }, |
| 233 | + { |
| 234 | + "cell_type": "markdown", |
| 235 | + "metadata": {}, |
| 236 | + "source": [ |
| 237 | + "After creating the custom layer, using it is as simple as using any other layer in geoplotlib. \n", |
| 238 | + "We can call the `add_layer` method and pass in an instance of our custom layer class, created with the parameters it needs." |
| 239 | + ] |
| 240 | + }, |
| 241 | + { |
| 242 | + "cell_type": "markdown", |
| 243 | + "metadata": {}, |
| 244 | + "source": [ |
| 245 | + "Our data is focused on the UK and specifically Leeds. \n", |
| 246 | + "So we want to adjust our bounding box to exactly this area." |
175 | 247 | ] |
176 | 248 | }, |
177 | 249 | { |
178 | 250 | "cell_type": "code", |
179 | | - "execution_count": 2, |
| 251 | + "execution_count": 10, |
180 | 252 | "metadata": {}, |
181 | 253 | "outputs": [], |
182 | 254 | "source": [ |
183 | | - "# converting the obesity into a color\n" |
| 255 | + "# bounding box for our view on Leeds\n", |
| 256 | + "from geoplotlib.utils import BoundingBox\n", |
| 257 | + "\n", |
| 258 | + "leeds_bbox = BoundingBox(north=53.8074, west=-3, south=53.7074, east=0)" |
184 | 259 | ] |
185 | 260 | }, |
186 | 261 | { |
187 | 262 | "cell_type": "code", |
188 | | - "execution_count": 3, |
| 263 | + "execution_count": 11, |
189 | 264 | "metadata": {}, |
190 | 265 | "outputs": [], |
191 | 266 | "source": [ |
192 | | - "# plotting the shaded states and adding another layer which plots the state outlines in white\n", |
193 | | - "# our BoundingBox should focus the USA\n" |
| 267 | + "# displaying our custom layer using add_layer\n", |
| 268 | + "from geoplotlib.utils import DataAccessObject\n" |
| 269 | + ] |
| 270 | + }, |
| 271 | + { |
| 272 | + "cell_type": "markdown", |
| 273 | + "metadata": {}, |
| 274 | + "source": [ |
| 275 | + "**Note:** \n", |
| 276 | + "To avoid errors inside the library, we have to convert our pandas DataFrame to a geoplotlib `DataAccessObject`. \n", |
| 277 | + "The creator of geoplotlib provides a handy interface for this conversion." |
194 | 278 | ] |
195 | 279 | }, |
196 | 280 | { |
197 | 281 | "cell_type": "markdown", |
198 | 282 | "metadata": {}, |
199 | 283 | "source": [ |
| 284 | + "Looking at the upper right-hand corner, we can clearly see the temporal aspect of this visualization. \n", |
| 285 | + "The first observation we make is that our data is really sparse: we sometimes have only a single data point for a plane, and a whole path is seldom drawn. \n", |
| 286 | + "\n", |
| 287 | + "Even though it is so sparse, we can already get a feeling for where the planes fly most.\n", |
| 288 | + "\n", |
200 | 289 | "**Note:** \n", |
201 | | - "In the introduction we mentioned that geoplotlib works with a layer based system. \n", |
202 | | - "This means that we can simply stack the same or different layers on top like in the example above. " |
| 290 | + "If you're interested in what else can be achieved with this custom layer approach, there are more examples in the geoplotlib repository. \n", |
| 291 | + "- https://github.com/andrea-cuttone/geoplotlib/blob/master/examples/follow_camera.py\n", |
| 292 | + "- https://github.com/andrea-cuttone/geoplotlib/blob/master/examples/quadtree.py\n", |
| 293 | + "- https://github.com/andrea-cuttone/geoplotlib/blob/master/examples/kmeans.py" |
203 | 294 | ] |
204 | 295 | } |
205 | 296 | ], |
|