Skip to content

Commit 402f393

Browse files
committed
Adding all final files to repo
1 parent a4144c9 commit 402f393

21 files changed

+5234
-158
lines changed
Lines changed: 22 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,22 @@
1+
Documentation for Crop Classification with Multi-Temporal Satellite Imagery
2+
Author: Rose Rustowicz
3+
21 December 2017
4+
5+
Directories:
6+
classifiers -- contains code for classifiers used (softmax regression, SVMs, Neural Network, Convolutional Neural Network)
7+
dataset_construction -- used to download and construct dataset
8+
evaluation -- used to evaluate further results (most evaluation is done within classifiers)
9+
10+
1.) To get the data, use 'dataset_construction/get_crop_data.ipynb'. Given an area of interest (AOI) (specified in a .geojson file), this notebook will take you through querying the Planet API, activating and downloading scenes, and clipping downloaded scenes to an AOI.
11+
12+
2.) To download the Crop Data Layer labels, go to 'https://nassgeodata.gmu.edu/CropScape/'. In the top bar, select the US map icon (filled with the US flag), or the icon to the right of that one, which you can use to manually select an area of interest. Select a region on tha map, making sure that your AOI is within the specified region. Click on the right-most icon (the folder with a green arrow) on the top bar. Download the selected AOI.
13+
14+
3.) Now that you have downloaded data from CropScape, you need to clip it to the same AOI as in the imagery (from step 1). Open either 'dataset_construction/Crop_CDL_AOIs.ipynb' or 'dataset_construction/clip_CDL.py'. Specify the path to the downloaded CDL labels (from step 2) as 'CDL_fname' in the first line of the notebook. Specify the AOI filename in the second line of the notebook. This should be the same AOI used in step 1. This clipped CDL labels should be saved to the current directory.
15+
16+
4.) You will not create data from the clipped imagery. Open 'dataset_construction/clips_to_datacube.py'. Make sure that all of the clipped imagery from step 1 is saved to a folder, and input that folder as 'imgs_dir' under the first line of the 'main()' function. Specify the filename of the clipped data labels from step 3 as 'labels_fname'. The output will be five files, one datacube for each spectral band: 'b_time.npy', 'g_time.npy', 'r_time.npy', 're_time.npy', and 'nir_time.npy'.
17+
18+
5.) From the large selected scene, we will select the dataset. This step requires some small investigation in order to select the classes for your classification problem. You will be using 'dataset_construction/mk_dataset.py'. Open up this file. Starting in the first line of the 'main' function, specify the filenames (and necessary relative paths) of the files created in step 4. The 'sort_crops' function will print out the top 20 crops in the dataset. It is your decision on which crops to keep. The crop types corresponding to the 'sorted_crops' values can be found in the CDL database, for example here: https://www.nass.usda.gov/Research_and_Science/Cropland/metadata/metadata_ca16.htm, where the values correspond to the 'Attribute Code' column. Select which indices of the top 20 crops you want to keep, and specify those indices within the 'get_masks' function when defining the 'final_mask_xl' variable. Within the 'concat_features' function, you will also need to change the indices of the masks for each crop type. And within 'get_labels', you will need to change the label numbers from Attribute Codes into integer values starting from 0. 'crop_dataset' takes 100000 examples from each class and adds them to the dataset. Feel free to change this number. You now have data and labels for training, validation, and testing.
19+
20+
6.) Now that your data is ready, you can run the classifiers! Go to the 'classifiers/' directory. You will find code for multiclass logistic regression ('softmax_sklearn.py'), support vector machines ('svms_sklearn.py'), a simple neural network ('NN_keras.py'), and a simple Convolutional neural network ('CNN_keras.py'). Make sure that you have the ability to use both scikit learn and keras on the computer you are using. You may need to tune the parameters in each of the classification methods.
21+
22+
Lines changed: 50 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,50 @@
1+
For Scene #1, 15 timestamps were used from Rapid Eye.
2+
3+
The defined polygon is ...
4+
POLYGON((-119.8103+35.9859,-119.6456+35.9859,-119.6456+36.1363,-119.8103+36.1363,-119.8103+35.9859))
5+
6+
Dates used for Scene #1:
7+
1155704_2016-02-07_RE4_3A
8+
1155704_2016-02-24_RE1_3A
9+
1155704_2016-03-01_RE3_3A
10+
1155704_2016-04-04_RE4_3A
11+
1155704_2016-05-23_RE5_3A
12+
1155704_2016-06-17_RE1_3A
13+
1155704_2016-07-18_RE4_3A #used for mono-temporal case
14+
1155704_2016-08-02_RE5_3A
15+
1155704_2016-08-05_RE3_3A
16+
1155704_2016-08-23_RE2_3A
17+
1155704_2016-09-06_RE1_3A
18+
1155704_2016-09-25_RE1_3A
19+
1155704_2016-10-25_RE3_3A
20+
1155704_2016-11-17_RE2_3A
21+
1155704_2016-12-27_RE4_3A
22+
23+
24+
For Scene #2, 21 timestamps were used from Rapid Eye.
25+
26+
The defined polygon is ...
27+
POLYGON((-119.6147+36.2116,-119.5298+36.2116,-119.5298+36.2981,-119.6147+36.2981,-119.6147+36.2116))
28+
29+
Dates used for Scene #2:
30+
1155805_2016-02-07_RE4_3A
31+
1155805_2016-02-24_RE1_3A
32+
1155805_2016-03-01_RE3_3A
33+
1155805_2016-04-04_RE4_3A
34+
1155805_2016-04-27_RE3_3A
35+
1155805_2016-05-15_RE2_3A
36+
1155805_2016-05-23_RE5_3A
37+
1155805_2016-06-17_RE1_3A
38+
1155805_2016-07-18_RE4_3A #used for mono-temporal case
39+
1155805_2016-08-23_RE2_3A
40+
1155805_2016-08-24_RE3_3A
41+
1155805_2016-09-06_RE1_3A
42+
1155805_2016-09-08_RE4_3A
43+
1155805_2016-09-09_RE5_3A
44+
1155805_2016-09-25_RE1_3A
45+
1155805_2016-10-01_RE3_3A
46+
1155805_2016-10-06_RE3_3A
47+
1155805_2016-10-25_RE3_3A
48+
1155805_2016-11-13_RE3_3A
49+
1155805_2016-11-17_RE2_3A
50+
1155805_2016-12-27_RE4_3A
Binary file not shown.
Binary file not shown.
Binary file not shown.

jupyter-notebooks/temporal-crop-classification/classifiers/CNN_keras.py

Lines changed: 20 additions & 19 deletions
Original file line numberDiff line numberDiff line change
@@ -1,4 +1,5 @@
1-
# Implementation of a CNN using Keras
1+
# Implementation of a CNN using Keras. Tuning of hyperparameters may be needed for
2+
# best performance
23

34
import numpy as np
45
import pdb
@@ -57,19 +58,19 @@ def evaluate_loaded_model(loaded_model, X_test, y_test):
5758
print('%s: %.2f%%' % (loaded_model.metrics_names[1], score[1]*100))
5859

5960
def create_model(num_classes):
60-
# Define the model with the Keras Sequential modeling framework
61+
# Define the model with the Keras Sequential modeling framework. Number of input features and output classes
62+
# must be changed depending on what dataset you are working with
6163
model = Sequential()
62-
63-
#model.add(Conv1D(32, kernel_size=5, strides=1, activation='relu', input_shape=(105,1)))
64-
model.add(Conv1D(8, kernel_size=3, strides=1, activation='relu', input_shape=(5,1)))
65-
#model.add(MaxPooling1D(pool_size=2, strides=2))
66-
#model.add(Conv1D(64, 5, activation='relu'))
67-
model.add(Conv1D(16, 3, activation='relu'))
68-
#model.add(MaxPooling1D(pool_size=2))
69-
model.add(Flatten())
70-
#model.add(Dense(1000, activation='relu'))
71-
model.add(Dense(200, activation='relu'))
72-
model.add(Dense(num_classes, activation='softmax'))
64+
#model.add(Conv1D(32, kernel_size=5, strides=1, activation='relu', input_shape=(105,1))) # multi-temporal
65+
model.add(Conv1D(8, kernel_size=3, strides=1, activation='relu', input_shape=(5,1))) # mono-temporal
66+
#model.add(MaxPooling1D(pool_size=2, strides=2)) # multi-temporal
67+
#model.add(Conv1D(64, 5, activation='relu')) # multi-temporal
68+
model.add(Conv1D(16, 3, activation='relu')) # mono-temporal
69+
#model.add(MaxPooling1D(pool_size=2)) # multi-temporal
70+
model.add(Flatten()) # both
71+
#model.add(Dense(1000, activation='relu')) # multi-temporal
72+
model.add(Dense(200, activation='relu')) # mono-temporal
73+
model.add(Dense(num_classes, activation='softmax')) # both
7374

7475
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
7576
return model
@@ -130,6 +131,7 @@ def main():
130131
print("Saved model to disk")
131132

132133
def main_eval():
134+
# Used to evaluate the CNN if the model is already saved (only inference, no training)
133135
json_fname = 'model_10epoch_kings04.json'
134136
h5_fname = 'model_10epoch_kings04.h5'
135137

@@ -162,13 +164,12 @@ def main_eval():
162164

163165
class_names = ['cotton', 'safflower', 'tomatoes', 'wintwheat', 'durwheat', 'idle']
164166
#class_names = ['wintwht/corn', 'alfalfa', 'almonds', 'pistachios', 'idle', 'corn', 'walnuts', 'cotton', 'wintwheat']
165-
plt.figure() #figsize=(8.5,7))
166-
#plt.figure()
167+
plt.figure()
167168
plot_confusion_matrix(cnf_matrix, classes=class_names, normalize=True, title='Scene #1 Multi-Temporal Confusion Matrix')
168-
#plt.show()
169+
plt.show()
169170
matplotlib.rcParams.update({'font.size': 40})
170-
plt.savefig('Scene1_confmat_TESTBLOCK.png')
171+
#plt.savefig('Scene1_confmat_TESTBLOCK.png')
171172

172173
if __name__ == '__main__':
173-
#main()
174-
main_eval()
174+
main()
175+
#main_eval()

jupyter-notebooks/temporal-crop-classification/classifiers/NN_keras.py

Lines changed: 6 additions & 18 deletions
Original file line numberDiff line numberDiff line change
@@ -1,5 +1,5 @@
11
# This script uses the Sequential model within Keras to implement
2-
# a simple one-hidden-layeer neural network
2+
# a simple one-hidden-layer neural network. Tune parameters for best performance.
33

44
from keras.models import Sequential
55
from keras.layers import Dense
@@ -14,11 +14,6 @@
1414

1515
np.random.seed(10) # for reproducability
1616

17-
def one_hot(y, num_classes):
18-
onehot = np.zeros((y.shape[0], num_classes))
19-
onehot[np.arange(y.shape[0]), y] = 1
20-
return onehot
21-
2217
def load_data():
2318
# Load train, validation, and test data
2419
X = np.load('../datasets/kings_05_rapideye/train_data_single.npy')
@@ -40,11 +35,16 @@ def load_data():
4035
def create_model(units, activation, loss, optimizer, metrics, reg, dropout_rate, weight_constraint):
4136
# Define model as a sequence of layers; Dense models are fully-connected models
4237
model = Sequential()
38+
# Mono-temporal has 5 features (1 timestamp x 5 spectral bands)
4339
model.add(Dense(units=units, input_dim=5, activation=activation, kernel_regularizer=regularizers.l2(reg), kernel_constraint=maxnorm(weight_constraint)))
40+
# Scene 1 has 75 features (15 timestamps x 5 spectral bands)
4441
#model.add(Dense(units=units, input_dim=75, activation=activation, kernel_regularizer=regularizers.l2(reg), kernel_constraint=maxnorm(weight_constraint)))
42+
# Scene 2 has 105 features (21 timestamps x 5 spectral bands)
4543
#model.add(Dense(units=units, input_dim=105, activation=activation, kernel_regularizer=regularizers.l2(reg), kernel_constraint=maxnorm(weight_constraint)))
4644
model.add(Dropout(dropout_rate))
45+
# Scene 1 has 6 output classes
4746
#model.add(Dense(6, activation='softmax'))
47+
# Scene 2 has 9 output classes
4848
model.add(Dense(9, activation='softmax'))
4949
model.compile(optimizer, loss, metrics)
5050
return model
@@ -68,8 +68,6 @@ def predict(model, X, y, X_dev, y_dev, X_test, y_test):
6868
pred_y = model.predict(X)
6969
pred_y_dev = model.predict(X_dev)
7070
pred_y_test = model.predict(X_test)
71-
72-
#print(predictions)
7371
return pred_y, pred_y_dev, pred_y_test
7472

7573
def main():
@@ -85,16 +83,6 @@ def main():
8583
print('epoch: %s' % (epoch))
8684
model.fit(X, y, batch, epoch)
8785
scores, val_scores, test_score = evaluate_model(model, X, y, X_dev, y_dev, X_test, y_test)
88-
89-
#pred_y, pred_y_dev, pred_y_test = predict(model, X, y, X_dev, y_dev, X_test, y_test)
90-
91-
# Summarize results
92-
#print('Best: %f using %s' % (grid_result.best_score_, grid_result.best_params_))
93-
#means = grid_result.cv_results_['mean_test_score']
94-
#stds = grid_result.cv_results_['std_test_score']
95-
#params = grid_result.cv_results_['params']
96-
#for mean, stdev, param in zip(means, stds, params):
97-
# print('%f (%f) with %r' % (mean, stdev, param))
9886

9987
if __name__ == '__main__':
10088
main()

jupyter-notebooks/temporal-crop-classification/classifiers/softmax_sklearn.py

Lines changed: 24 additions & 12 deletions
Original file line numberDiff line numberDiff line change
@@ -6,21 +6,33 @@
66
from sklearn.metrics import classification_report
77
import pdb
88

9-
X = np.load('../../datasets/kings_04_rapideye/train_data.npy')
10-
X = X*1.0/np.max(X)
11-
y = np.load('../../datasets/kings_04_rapideye/train_lbl_sc.npy')
9+
X_tr = np.load('../../datasets/kings_04_rapideye/train_data.npy')
10+
X_tr = X_tr*1.0/np.max(X_tr)
11+
y_tr = np.load('../../datasets/kings_04_rapideye/train_lbl_sc.npy')
1212

13-
all_X = np.load('../../datasets/kings_04_rapideye/all_data.npy')
14-
all_X = all_X*1.0/np.max(all_X)
15-
all_y = np.load('../../datasets/kings_04_rapideye/all_lbls.npy')
13+
X_val = np.load('../../datasets/kings_04_rapideye/val_data.npy')
14+
X_val = X_val*1.0/np.max(X_val)
15+
y_val = np.load('../../datasets/kings_04_rapideye/val_lbl_sc.npy')
1616

17+
X_test = np.load('../../datasets/kings_04_rapideye/test_data.npy')
18+
X_test = X_test*1.0/np.max(X_test)
19+
y_test = np.load('../../datasets/kings_04_rapideye/test_lbl_sc.npy')
20+
21+
# Define the model, you may need to explore many for the best results.
1722
logreg = linear_model.LogisticRegression(penalty='l2', C=10, solver='saga', max_iter=10000, multi_class='multinomial')
18-
lr_fit = logreg.fit(X, y)
19-
print(logreg.score(X, y))
23+
lr_fit = logreg.fit(X_tr, y_tr)
24+
print(logreg.score(X_tr, y_tr))
2025

21-
predictions = logreg.predict(all_X)
22-
print(logreg.score(all_X, all_y))
23-
print(classification_report(all_y, predictions))
26+
predictions = logreg.predict(X_val)
27+
print(logreg.score(X_val, y_val))
28+
print(classification_report(y_val, predictions))
2429

25-
np.save('logreg_kings04_predictions.npy', predictions)
30+
#np.save('logreg_kings04_predictions.npy', predictions)
2631

32+
# Once the model is tuned for best performance, you can also evaluate the test set.
33+
# If the model performs well on training, but poorly on validation, you have overfit to the training model.
34+
# If the model performs well on training and validation, but poorly on test, you have overfit to the training and validation model.
35+
# If the model performs similarly on training / val / test but all are poor, your model is not complex enough.
36+
predictions_test = logreg.predict(X_test)
37+
print(logreg.score(X_test, y_test))
38+
print(classification_report(y_test, predictions_test))
Original file line numberDiff line numberDiff line change
@@ -1,4 +1,6 @@
1-
# Implementation of SVMs with sklearn
1+
# Implementation of SVMs with sklearn. You may need to tweak parameters
2+
# to achieve successful classification results. Depending on the size of
3+
# the training set, the SVM code may take a long time to run.
24

35
import numpy as np
46
import pickle
@@ -8,31 +10,25 @@
810
from sklearn.metrics import classification_report
911
import matplotlib.pyplot as plt
1012

13+
# Load training
1114
X_tr = np.load('../datasets/kings_04_rapideye/train_data.npy')
12-
1315
X_tr = X_tr*1.0/np.max(X_tr)
1416
y_tr = np.load('../datasets/kings_04_rapideye/train_lbl_sc.npy')
1517
X = X_tr[0:60000,:]
1618
y = y_tr[0:60000]
17-
print(X.shape)
18-
print(y)
1919

20+
# Load validation
2021
X_val = np.load('../datasets/kings_04_rapideye/val_data.npy')
2122
X_val = X_val*1.0/np.max(X_val)
2223
y_val = np.load('../datasets/kings_04_rapideye/val_lbl_sc.npy')
23-
print(X_val.shape)
24-
print(y_val)
25-
print(y_val.shape)
2624

25+
# Load test
2726
X_test = np.load('../datasets/kings_04_rapideye/test_data.npy')
2827
X_test = X_test*1.0/np.max(X_test)
2928
y_test = np.load('../datasets/kings_04_rapideye/test_lbl_sc.npy')
30-
print(X_test.shape)
31-
print(y_test)
32-
print(y_test.shape)
3329

34-
for C in [1000000]: #[100000, 1000000]:
35-
for gamma in [0.1]: #[0.001, 0.01]:
30+
for C in [1, 10, 100, 1000, 10000, 100000, 1000000]:i #1000000
31+
for gamma in [0.001, 0.01, 0.1]: #0.1
3632
print('---------')
3733
print('C: , gamma: ')
3834
print(C, gamma)
@@ -46,14 +42,17 @@
4642
print('Acc:')
4743
print(accuracy_score(y, y_pred))
4844
y_val_pred = rbf_svc.predict(X_val)
49-
print('Result on validation set')
45+
print('Result on validation set')
5046
print(classification_report(y_val, y_val_pred))
5147
print('Acc:')
5248
print(accuracy_score(y_val, y_val_pred))
53-
y_test_pred = rbf_svc.predict(X_test)
54-
print('Result on test set')
55-
print(classification_report(y_test, y_test_pred))
56-
print('Acc:')
57-
print(accuracy_score(y_test, y_test_pred))
58-
fname = 'svm_kings04_temporal.sav'
59-
pickle.dump(rbf_svc, open(fname, 'wb'))
49+
# The model should be chosen based on the performance of the validation set. Do not view test results until the model is tweaked for best performance.
50+
#y_test_pred = rbf_svc.predict(X_test)
51+
#print('Result on test set')
52+
#print(classification_report(y_test, y_test_pred))
53+
#print('Acc:')
54+
#print(accuracy_score(y_test, y_test_pred))
55+
56+
# Save the model
57+
#fname = 'svm_kings04_temporal.sav'
58+
#pickle.dump(rbf_svc, open(fname, 'wb'))

0 commit comments

Comments
 (0)