Closed
110 commits
143a155
feat: Add General Layer class
Jul 17, 2017
8328db9
feat: Register the new Deep Learning Method
IlievskiV Jul 17, 2017
33e254b
feat: Define Deep Net class
IlievskiV Jul 17, 2017
8af8ac6
feat: Define Deep Net Method class
IlievskiV Jul 17, 2017
99dd5dc
feat: Add Dense Layer class
IlievskiV Jul 17, 2017
b780c6a
feat: Implement Conv and Max Pool Layer propagation backend
IlievskiV Jul 17, 2017
6074636
feat: Implement Conv Layer Class
IlievskiV Jul 17, 2017
b53ff3a
feat: Implement Max Pool Layer Class
IlievskiV Jul 17, 2017
5756fb6
feat: Define Reshape Layer Class
IlievskiV Jul 17, 2017
4727600
feat: Implement Tensor Data Loader Class
IlievskiV Jul 17, 2017
a799889
feat: Implement Copy Tensor Input and Copy Tensor Output methods
IlievskiV Jul 17, 2017
1d0bfd9
feat: Implement Deep Learning Minimizers
IlievskiV Jul 17, 2017
432fca9
feat: Implement Create Deep Net and the Parsing Layer Methods
IlievskiV Jul 17, 2017
b6b7b8e
feat: Insert Fetch Helper Methods
IlievskiV Jul 17, 2017
a258701
feat: Insert Declare Options and Parse Key Value String methods
IlievskiV Jul 17, 2017
fc89247
feat: Implement Process Options method
IlievskiV Jul 17, 2017
305446d
feat: Implement Train GPU method
IlievskiV Jul 17, 2017
0c64f5f
feat: Define Conv and Max Pool Layer propagation CPU backend
IlievskiV Jul 18, 2017
c1d2c74
feat: Define Conv and Max Pool Layer propagation GPU backend
IlievskiV Jul 18, 2017
a0ef9df
fix: Add 'public' keyword in the inheritance
IlievskiV Jul 18, 2017
040a1d7
fix: Include CPU and GPU backends, conditionally
IlievskiV Jul 18, 2017
9878bbd
feat: Implement Deep Net class
IlievskiV Jul 18, 2017
7008bf3
feat: Add weight matrix in the Tensor Batch class
IlievskiV Jul 18, 2017
8cf90e1
fix: Wrong method names
IlievskiV Jul 18, 2017
0b4c028
fix: Change the method signatures
IlievskiV Jul 18, 2017
fde26b2
fix: Include headers in Method DL
IlievskiV Jul 18, 2017
779162e
feat: Define Reshape kernel
IlievskiV Jul 19, 2017
24e3ce5
feat: Implement Forward and Backward pass in Reshape Layer
IlievskiV Jul 19, 2017
076e5a2
test: Add Im2Col, Downsample and RotateWeights tests
IlievskiV Jul 19, 2017
b312d93
test: Implement function for creating test conv net
IlievskiV Jul 19, 2017
1e907ee
test: Implement Forward pass test
IlievskiV Jul 19, 2017
e7d21ad
test: Implement Conv Loss function test
IlievskiV Jul 19, 2017
9e262dd
test: Implement Conv Prediction function test
IlievskiV Jul 19, 2017
5d65370
test: Implement Conv Backpropagation test
IlievskiV Jul 20, 2017
537c58d
RNNLayer added v1
sshekh Jul 20, 2017
ce65582
ScaleAdd and GetMatrix functions on vectors added
sshekh Jul 21, 2017
2b759c5
Adding Denoise Layer for DeepAutoEncoders
ajatgd Jul 21, 2017
5d071dd
Adding Transform Layer for Deep AutoEncoders
ajatgd Jul 21, 2017
097b3ff
Adding Tensor input and Forward in Denoise Layer
ajatgd Jul 22, 2017
037f613
Fixing a small bug in Denoise Layer
ajatgd Jul 22, 2017
85dc350
Adding DenoisePropagation methods for Reference Architecture
ajatgd Jul 23, 2017
6bc4c32
adding test for Denoise Layer Propagation
ajatgd Jul 23, 2017
4c5982a
Adding Denoise Layers to DeepNet
ajatgd Jul 23, 2017
e3a6602
Adding Logistic Regression Layer and removing Transformed Layer as it…
ajatgd Jul 25, 2017
ab38f5b
Adding tests for Logistic Regression Layer
ajatgd Jul 25, 2017
8f3cea6
Adding Logistic Regression Layer to DeepNet
ajatgd Jul 25, 2017
97f821e
refactor: Migrate to vector of weights and biases, DAE Build Breaking
sshekh Jul 27, 2017
1de838a
refactor: pointers removed from ScaleAdd and Copy signatures
sshekh Jul 27, 2017
5b6aa05
Refactor: Adding Corruption, Compression, Reconstruction layer in acc…
ajatgd Jul 28, 2017
1344d22
Refactor: Adding modified Layers to DeepNet and adding pretrain
ajatgd Jul 28, 2017
48844a0
Refactor: Migrating layers to new general layer constructor, adding d…
ajatgd Jul 31, 2017
45bf15d
Refactor: Adding two parameters to Backward in all layers
ajatgd Aug 1, 2017
a68eb04
Forward test RNN added
sshekh Aug 1, 2017
0890382
Adding FineTune function in DeepNet and test for same
ajatgd Aug 2, 2017
34fb0c6
Adding an attribute for the type of layer in General Layer
ajatgd Aug 3, 2017
5af8f1b
refactor: Format the coding style
IlievskiV Aug 5, 2017
a0807ff
feat: Implement the CPU architecture for Conv Layers
IlievskiV Aug 5, 2017
2cd5e52
feat: Implement Copy function in Tensor Data Loader
IlievskiV Aug 5, 2017
04ab41d
Full example added
sshekh Aug 6, 2017
742f92e
Removing Layer Type attribute from general layer and adding docs for …
ajatgd Aug 6, 2017
8ff7195
test: Add Im2Col, Downsample and RotateWeights tests for CPU
IlievskiV Aug 7, 2017
f00eb50
test: Add Conv Forward Pass Test for CPU
IlievskiV Aug 7, 2017
8344e6e
test: Add Conv Net Loss function test for CPU
IlievskiV Aug 7, 2017
4c7675c
test: Add Conv Net Prediction function test for CPU
IlievskiV Aug 7, 2017
4fa3b09
feat: Implement Tensor Data Loader for Reference
IlievskiV Aug 7, 2017
97d2d89
fix: Input Tensor not initialized properly
IlievskiV Aug 8, 2017
fe615d4
feat: Add function for constructing linear conv net
IlievskiV Aug 8, 2017
8f85bde
test: Add test for Tensor Data Loader for Reference backend
IlievskiV Aug 8, 2017
7d1d83f
feat: Define Flatten and Deflatten kernels
IlievskiV Aug 8, 2017
bdae1c8
feat: Implement Flatten and Deflatten for Reference and CPU
IlievskiV Aug 8, 2017
9be02e5
test: Add Tensor Data Loader test for CPU backend
IlievskiV Aug 8, 2017
60710e5
test: Add test for Flatten for the Reference backend
IlievskiV Aug 8, 2017
b3034ac
feat: Add flattening option in the Reshape Layer
IlievskiV Aug 9, 2017
4222250
fix: Bug fix in the Conv Layer Backprop step
IlievskiV Aug 9, 2017
e089324
temp: Full RNN fixes
sshekh Aug 9, 2017
c2d5ec8
fix: Fix Conv Layer Backward
IlievskiV Aug 13, 2017
962b40b
fix: Change to reference input in the Forward call
IlievskiV Aug 13, 2017
9e4c340
feat: Add test for loading real dataset
IlievskiV Aug 13, 2017
b83c9b0
test: Add tests for minimizers
IlievskiV Aug 13, 2017
4f65dec
feat: Define input layout string
IlievskiV Aug 14, 2017
9f938d1
test: Add test for testing Method DL for CPU
IlievskiV Aug 15, 2017
ad254f3
fix: Multiply Transpose error for CPU backend
IlievskiV Aug 15, 2017
fe26262
feat: Backprop test for Denselayer added
sshekh Aug 17, 2017
f536b42
fix: Add condition for dummy backward gradients in the Dense Layer
IlievskiV Aug 22, 2017
64f7171
feat: Define batch layout string
IlievskiV Aug 22, 2017
72d37e2
feat: Add additional condition for loading batches
IlievskiV Aug 22, 2017
138d8e1
test: Add test for Method DL, for the DNN case
IlievskiV Aug 22, 2017
3a89807
test: Add test for Method DL, for DNN case
IlievskiV Aug 22, 2017
ed24c1e
fix: Initialize bias gradients to zero
IlievskiV Aug 23, 2017
30b337e
MethodDL RNN Parser added
sshekh Aug 25, 2017
cc2bfd0
RNN dimensions changed and full network working
sshekh Aug 29, 2017
83d71ed
CPU (Blas) Support added
sshekh Aug 29, 2017
13419bc
Added Cuda Support in recurrent propagation
sshekh Oct 4, 2017
2fc6fa7
Minor changes, methodDL multi-threading in Minimizer removed
sshekh Oct 5, 2017
c7f2f59
Minor change params of RNNLayer
sshekh Oct 5, 2017
5f92805
TMVA: implemented method GetMvaValue for method DL
omazapa Oct 12, 2017
c3ba4f2
TMVA:
omazapa Oct 12, 2017
dd0ab43
TMVA: removed compilation warnings
omazapa Oct 12, 2017
e6ccfb0
TMVA: moved fPool from TCpuMatrix to TMVA::Config class and removed m…
omazapa Oct 13, 2017
8ba7c95
FIX: MVaValue Calculation in Cpu Architecture
sshekh Oct 20, 2017
79316fd
TMVA: remove warnings in DenoisePropagation.cxx and Propagation.cxx
omazapa Oct 24, 2017
893ac3c
TMVA: removed warnings in TensorDataLoader and TestBackpropagationDL
omazapa Oct 24, 2017
196978b
TMVA: removing more warnings from multiple types of layers and in som…
omazapa Oct 24, 2017
9d76c74
TMVA: removed more warnings
omazapa Oct 24, 2017
048d589
Fix test file name and input batch layout
lmoneta Oct 24, 2017
27c056f
Fix layout string for DNN test. Need a reshape layer before a DNN layer
lmoneta Oct 25, 2017
3d6d6a3
Fix input parameter for Reshape Layer
lmoneta Oct 25, 2017
a353317
Thanks to Vladimir fix weight gradient and activation gradient compu…
lmoneta Nov 9, 2017
d590c6f
Remove some debug print out and improve test
lmoneta Nov 9, 2017
2557947
Fix the padding when computing the activation gradient
lmoneta Nov 10, 2017
temp: Full RNN fixes
sshekh committed Aug 9, 2017
commit e08932468ef6e104d50ab07381245cd838391847
3 changes: 3 additions & 0 deletions tmva/tmva/inc/TMVA/DNN/Architectures/Cpu.h
@@ -386,6 +386,9 @@ class TCpu
* tensor \p B. */
static void Deflatten(std::vector<TCpuMatrix<AReal>> &A, const TCpuMatrix<AReal> &B, size_t index, size_t nRows,
size_t nCols);
/** Rearrange data according to time: fill B x T x D out with T x B x D matrix in. */
static void Rearrange(std::vector<TCpuMatrix<AReal>> &out, const std::vector<TCpuMatrix<AReal>> &in);


///@}

2 changes: 2 additions & 0 deletions tmva/tmva/inc/TMVA/DNN/Architectures/Cuda.h
@@ -392,6 +392,8 @@ class TCuda
/** Transforms each row of \p B to a matrix and stores it in the tensor \p B. */
static void Deflatten(std::vector<TCudaMatrix<AFloat>> &A, const TCudaMatrix<AFloat> &B, size_t index, size_t nRows,
size_t nCols);
/** Rearrange data according to time: fill B x T x D out with T x B x D matrix in. */
static void Rearrange(std::vector<TCudaMatrix<AFloat>> &out, const std::vector<TCudaMatrix<AFloat>> &in);

///@}

2 changes: 2 additions & 0 deletions tmva/tmva/inc/TMVA/DNN/Architectures/Reference.h
@@ -388,6 +388,8 @@ class TReference
/** Transforms each row of \p B to a matrix and stores it in the tensor \p B. */
static void Deflatten(std::vector<TMatrixT<AReal>> &A, const TMatrixT<Scalar_t> &B, size_t index, size_t nRows,
size_t nCols);
/** Rearrange data according to time: fill B x T x D out with T x B x D matrix in. */
static void Rearrange(std::vector<TMatrixT<AReal>> &out, const std::vector<TMatrixT<AReal>> &in);

///@}

36 changes: 29 additions & 7 deletions tmva/tmva/inc/TMVA/DNN/DeepNet.h
@@ -136,7 +136,7 @@ class TDeepNet {

/*! Function for adding Recurrent Layer in the Deep Neural Network,
* with given parameters */
TBasicRNNLayer<Architecture_t> *AddBasicRNNLayer(size_t batchSize, size_t stateSize, size_t inputSize,
TBasicRNNLayer<Architecture_t> *AddBasicRNNLayer(size_t stateSize, size_t inputSize,
size_t timeSteps, bool rememberState = false);

/*! Function for adding Vanilla RNN when the layer is already created
@@ -491,12 +491,12 @@ void TDeepNet<Architecture_t, Layer_t>::AddMaxPoolLayer(TMaxPoolLayer<Architectu

//______________________________________________________________________________
template <typename Architecture_t, typename Layer_t>
TBasicRNNLayer<Architecture_t> *TDeepNet<Architecture_t, Layer_t>::AddBasicRNNLayer(size_t batchSize, size_t stateSize,
TBasicRNNLayer<Architecture_t> *TDeepNet<Architecture_t, Layer_t>::AddBasicRNNLayer(size_t stateSize,
size_t inputSize, size_t timeSteps,
bool rememberState)
{
TBasicRNNLayer<Architecture_t> *basicRNNLayer = new TBasicRNNLayer<Architecture_t>(
batchSize, stateSize, inputSize, timeSteps, rememberState, DNN::EActivationFunction::kTanh, fIsTraining);
this->GetBatchSize(), stateSize, inputSize, timeSteps, rememberState, DNN::EActivationFunction::kTanh, fIsTraining, this->GetInitialization());
fLayers.push_back(basicRNNLayer);
return basicRNNLayer;
}
@@ -684,6 +684,22 @@ auto TDeepNet<Architecture_t, Layer_t>::Initialize() -> void
}
}

template <typename Architecture>
auto debugTensor(const std::vector<typename Architecture::Matrix_t> &A, const std::string name = "tensor")
-> void
{
std::cout << name << "\n";
for (size_t l = 0; l < A.size(); ++l) {
for (size_t i = 0; i < A[l].GetNrows(); ++i) {
for (size_t j = 0; j < A[l].GetNcols(); ++j) {
std::cout << A[l](i, j) << " ";
}
std::cout << "\n";
}
std::cout << "********\n";
}
}

//______________________________________________________________________________
template <typename Architecture_t, typename Layer_t>
auto TDeepNet<Architecture_t, Layer_t>::Forward(std::vector<Matrix_t> input, bool applyDropout) -> void
@@ -715,6 +731,7 @@ auto TDeepNet<Architecture_t, Layer_t>::ParallelForward(std::vector<TDeepNet<Arc
}
}
}

//_____________________________________________________________________________
template <typename Architecture_t, typename Layer_t>
auto TDeepNet<Architecture_t, Layer_t>::PreTrain(std::vector<Matrix_t> &input,
@@ -804,6 +821,7 @@ auto TDeepNet<Architecture_t, Layer_t>::PreTrain(std::vector<Matrix_t> &input,
fLayers.back()->Print();
}
}

//______________________________________________________________________________
template <typename Architecture_t, typename Layer_t>
auto TDeepNet<Architecture_t, Layer_t>::FineTune(std::vector<Matrix_t> &input, std::vector<Matrix_t> &testInput,
@@ -846,14 +864,18 @@ auto TDeepNet<Architecture_t, Layer_t>::Backward(std::vector<Matrix_t> input, co
evaluateGradients<Architecture_t>(fLayers.back()->GetActivationGradientsAt(0), this->GetLossFunction(), groundTruth,
fLayers.back()->GetOutputAt(0), weights);
for (size_t i = fLayers.size() - 1; i > 0; i--) {
std::vector<Matrix_t> activation_gradient_backward = fLayers[i - 1]->GetActivationGradients();
std::vector<Matrix_t> activations_backward = fLayers[i - 1]->GetOutput();
std::vector<Matrix_t> &activation_gradient_backward = fLayers[i - 1]->GetActivationGradients();
std::vector<Matrix_t> &activations_backward = fLayers[i - 1]->GetOutput();
fLayers[i]->Backward(activation_gradient_backward, activations_backward, inp1, inp2);
//debugTensor<Architecture_t>(activation_gradient_backward, "act grad backward after back of dense");
}

std::vector<Matrix_t> dummy;
std::vector<Matrix_t> gradient_input;
for (size_t i = 0; i < input.size(); i++) {
gradient_input.emplace_back(input[i].GetNrows(), input[i].GetNcols());
}

fLayers[0]->Backward(dummy, input, inp1, inp2);
fLayers[0]->Backward(gradient_input, input, inp1, inp2);
}

//______________________________________________________________________________
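One detail in the Backward change above is easy to miss: activation_gradient_backward and activations_backward are now bound by reference. Binding the result of GetActivationGradients() to a plain std::vector makes a copy, so gradients written by the layer's Backward call would land in a temporary and never reach the stored layer state. A toy sketch of the distinction (the Layer type below is a hypothetical stand-in, not the TMVA class):

```cpp
#include <cassert>
#include <vector>

// Toy stand-in for a layer whose gradients live inside the layer object.
struct Layer {
   std::vector<double> fActivationGradients{0.0, 0.0};
   std::vector<double> &GetActivationGradients() { return fActivationGradients; }
};

// Returns true iff the write through the reference reached the layer
// while the write through the copy did not.
bool GradientsNeedReferenceBinding()
{
   Layer layer;

   std::vector<double> copy = layer.GetActivationGradients(); // copies into a temporary
   copy[0] = 1.0;                                             // lost to the layer

   std::vector<double> &ref = layer.GetActivationGradients(); // aliases layer storage
   ref[1] = 2.0;                                              // persists in the layer

   return layer.fActivationGradients[0] == 0.0 &&
          layer.fActivationGradients[1] == 2.0;
}
```

The same reasoning motivates pre-sizing gradient_input for the first layer instead of passing an empty dummy vector.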
125 changes: 59 additions & 66 deletions tmva/tmva/inc/TMVA/DNN/RNN/RNNLayer.h
@@ -63,16 +63,6 @@ template<typename Architecture_t>

private:

/* from GeneralLayer:
* fBatchSize
* fInputDepth = 1
* fInputHeight = 1
* fInputWidth = inputSize
* fOutputDepth = 1
* fOutputHeight = 1
* fOutputWidth = stateSize
* fOutput = timeSteps x batchSize x stateSize */

size_t fTimeSteps; ///< Timesteps for RNN
size_t fStateSize; ///< Hidden state size of RNN
bool fRememberState; ///< Remember state in next pass
@@ -95,7 +85,7 @@ template<typename Architecture_t>
TBasicRNNLayer(size_t batchSize, size_t stateSize, size_t inputSize,
size_t timeSteps, bool rememberState = false,
DNN::EActivationFunction f = DNN::EActivationFunction::kTanh,
bool training = true);
bool training = true, DNN::EInitialization fA = DNN::EInitialization::kZero);

/** Copy Constructor */
TBasicRNNLayer(const TBasicRNNLayer &);
@@ -131,17 +121,9 @@ template<typename Architecture_t>
const Matrix_t & precStateActivations, const Matrix_t & currStateActivations,
const Matrix_t & input, Matrix_t & input_gradient);

/*! Return a vector of all learnable weights */
//std::vector<Matrix_t*> GetWeights() const;

///*! Return a vector of all learnable weights' gradients */
//std::vector<Matrix_t*> GetWeightGradients() const;

/*! Return a vector of all learnable biases */
//std::vector<Matrix_t*> GetBiases();

/*! Return a vector of all learnable bias' gradients */
//std::vector<Matrix_t*> GetBiasGradients();
/*! Rearrange data according to time:
* fill B x T x D out with T x B x D matrix in. */
//void Rearrange(Tensor_t &out, const Tensor_t &in);

/** Prints the info about the layer */
void Print() const;
@@ -158,10 +140,10 @@ template<typename Architecture_t>
const Matrix_t & GetWeightsInput() const {return fWeightsInput;}
Matrix_t & GetWeightsState() {return fWeightsState;}
const Matrix_t & GetWeightsState() const {return fWeightsState;}
Matrix_t & GetBiases() {return fBiases;}
const Matrix_t & GetBiases() const {return fBiases;}
Matrix_t & GetBiasGradients() {return fBiasGradients;}
const Matrix_t & GetBiasGradients() const {return fBiasGradients;}
//Matrix_t & GetBiases() {return fBiases;}
//const Matrix_t & GetBiases() const {return fBiases;}
//Matrix_t & GetBiasGradients() {return fBiasGradients;}
//const Matrix_t & GetBiasGradients() const {return fBiasGradients;}
Matrix_t & GetWeightInputGradients() {return fWeightInputGradients;}
const Matrix_t & GetWeightInputGradients() const {return fWeightInputGradients;}
Matrix_t & GetWeightStateGradients() {return fWeightStateGradients;}
@@ -177,9 +159,9 @@ template<typename Architecture_t>
TBasicRNNLayer<Architecture_t>::TBasicRNNLayer(size_t batchSize, size_t stateSize, size_t inputSize,
size_t timeSteps, bool rememberState,
DNN::EActivationFunction f,
bool training)
bool training, DNN::EInitialization fA)
: VGeneralLayer<Architecture_t>(batchSize, 1, 1, inputSize, 1, 1, stateSize, 2, {stateSize, stateSize}, {inputSize, stateSize},
1, {stateSize}, {1}, timeSteps, batchSize, stateSize, DNN::EInitialization::kZero),
1, {stateSize}, {1}, timeSteps, batchSize, stateSize, fA),
fTimeSteps(timeSteps), fStateSize(stateSize), fRememberState(rememberState), fWeightsInput(this->GetWeightsAt(0)), fF(f),
fState(batchSize, stateSize), fWeightsState(this->GetWeightsAt(1)), fBiases(this->GetBiasesAt(0)), fDerivatives(batchSize, stateSize),
fWeightInputGradients(this->GetWeightGradientsAt(0)), fWeightStateGradients(this->GetWeightGradientsAt(1)), fBiasGradients(this->GetBiasGradientsAt(0))
@@ -230,47 +212,46 @@ auto TBasicRNNLayer<Architecture_t>::Print() const
<< "Hidden State Size: " << this->GetStateSize() << "\n";
}

////______________________________________________________________________________
//template<typename Architecture_t>
//auto TBasicRNNLayer<Architecture_t>::GetWeights() const
//-> std::vector<Matrix_t*>
//{
// std::vector<Matrix_t*> weights;
// weights.emplace_back(&fWeightsInput);
// weights.emplace_back(&fWeightsState);
// return weights;
//}
//
////______________________________________________________________________________
//template<typename Architecture_t>
//auto TBasicRNNLayer<Architecture_t>::GetWeightGradients() const
//-> std::vector<Matrix_t*>
//{
// std::vector<Matrix_t*> weightGradients;
// weightGradients.emplace_back(&fWeightInputGradients);
// weightGradients.emplace_back(&fWeightStateGradients);
// return weightGradients;
//}
//
//______________________________________________________________________________
//template<typename Architecture_t>
//auto TBasicRNNLayer<Architecture_t>::GetBiases() const
//-> std::vector<Matrix_t*>
//auto TBasicRNNLayer<Architecture_t>::Rearrange(Tensor_t &out, const Tensor_t &in)
//-> void
//{
// std::vector<Matrix_t*> biases;
// biases.emplace_back(&fBiases);
// return biases;
// // B x T x D out --- T x B x D in*/
// size_t B = out.size();
// size_t T = out[0].GetNrows();
// size_t D = out[0].GetNcols();
// if ((T != in.size()) || (B != in[0].GetNrows())
// || (D != in[0].GetNcols())) {
// std::cout << "Incompatible Dimensions\n"
// << in.size() << "x" << in[0].GetNrows() << "x" << in[0].GetNcols()
// << " --> " << B << "x" << T << "x" << D << "\n";
// return;
// }
// for (size_t i = 0; i < B; ++i) {
// for (size_t j = 0; j < T; ++j) {
// for (size_t k = 0; k < D; ++k) {
// out[i](j, k) = in[j](i, k);
// }
// }
// }
// return;
//}

//______________________________________________________________________________
//template<typename Architecture_t>
//auto TBasicRNNLayer<Architecture_t>::GetBiasGradients() const
//-> std::vector<Matrix_t*>
//{
// std::vector<Matrix_t*> biasGradients;
// biasGradients.emplace_back(&fBiasGradients);
// return biasGradients;
//}
template <typename Architecture>
auto debugMatrix(const typename Architecture::Matrix_t &A, const std::string name = "matrix")
-> void
{
std::cout << name << "\n";
for (size_t i = 0; i < A.GetNrows(); ++i) {
for (size_t j = 0; j < A.GetNcols(); ++j) {
std::cout << A(i, j) << " ";
}
std::cout << "\n";
}
std::cout << "********\n";
}


//______________________________________________________________________________
template <typename Architecture_t>
@@ -291,9 +272,12 @@ auto inline TBasicRNNLayer<Architecture_t>::CellForward(Matrix_t &input)
{
// State = act(W_input . input + W_state . state + bias)
const DNN::EActivationFunction fF = this->GetActivationFunction();
//debugMatrix<Architecture_t>(input, "input");
Matrix_t tmpState(fState.GetNrows(), fState.GetNcols());
Architecture_t::MultiplyTranspose(tmpState, fState, fWeightsState);
Architecture_t::MultiplyTranspose(fState, input, fWeightsInput);
//debugMatrix<Architecture_t>(fWeightsInput, "weights input");
//debugMatrix<Architecture_t>(fState, "fState");
Architecture_t::ScaleAdd(fState, tmpState);
Architecture_t::AddRowWise(fState, fBiases);
DNN::evaluate<Architecture_t>(fState, fF);
@@ -310,7 +294 @@ auto inline TBasicRNNLayer<Architecture_t>::Backward(Tensor_t &gradients_backwar
// activations backward is input
// gradients_backward is activationGradients of layer before it, which is input layer
// currently gradient_backward is for input(x) and not for state
// we also need the one for state as
// TODO use this to change initial state??
Matrix_t state_gradients_backward(this->GetBatchSize(), fStateSize); // B x H
DNN::initialize<Architecture_t>(state_gradients_backward, DNN::EInitialization::kZero);

@@ -340,9 +324,18 @@ auto inline TBasicRNNLayer<Architecture_t>::CellBackward(Matrix_t & state_gradie
-> Matrix_t &
{
DNN::evaluateDerivative<Architecture_t>(fDerivatives, this->GetActivationFunction(), currStateActivations);
return Architecture_t::RecurrentLayerBackward(state_gradients_backward, fWeightInputGradients, fWeightStateGradients,
//debugMatrix<Architecture_t>(state_gradients_backward, "0 state grad");
//debugMatrix<Architecture_t>(fWeightInputGradients, "0 wx grad");
//debugMatrix<Architecture_t>(fWeightStateGradients, "0 wh grad");
//debugMatrix<Architecture_t>(fDerivatives, "bef df");
auto &lol = Architecture_t::RecurrentLayerBackward(state_gradients_backward, fWeightInputGradients, fWeightStateGradients,
fBiasGradients, fDerivatives, precStateActivations, fWeightsInput,
fWeightsState, input, input_gradient);
//debugMatrix<Architecture_t>(state_gradients_backward, "state grad");
//debugMatrix<Architecture_t>(fWeightInputGradients, "wx grad");
//debugMatrix<Architecture_t>(fWeightStateGradients, "wh grad");
//debugMatrix<Architecture_t>(fDerivatives, "df");
return lol;
}

} // namespace RNN
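CellForward above builds the vanilla RNN update h_t = act(x_t W_x^T + h_{t-1} W_h^T + b) out of MultiplyTranspose, ScaleAdd, and AddRowWise. A scalar-sized sketch of the same sequence of steps, with plain doubles standing in for the architecture's matrix kernels (hypothetical helper, not the TMVA code):

```cpp
#include <cassert>
#include <cmath>

// One RNN cell step for a single unit, mirroring CellForward:
//   tmpState = state * wState;   (MultiplyTranspose on the previous state)
//   state    = input * wInput;   (MultiplyTranspose on the input)
//   state   += tmpState;         (ScaleAdd)
//   state   += bias;             (AddRowWise)
//   state    = tanh(state);      (evaluate with kTanh)
double CellForward(double input, double state,
                   double wInput, double wState, double bias)
{
   double tmpState = state * wState;
   state = input * wInput;
   state += tmpState;
   state += bias;
   return std::tanh(state);
}
```

With matrices of shape B x D (input), B x H (state), H x D (W_x), and H x H (W_h), each line generalizes to the corresponding TMVA kernel call.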
25 changes: 25 additions & 0 deletions tmva/tmva/src/DNN/Architectures/Cpu/Propagation.cxx
@@ -394,5 +394,30 @@ void TCpu<AFloat>::Deflatten(std::vector<TCpuMatrix<AFloat>> &A, const TCpuMatri
}
}

//______________________________________________________________________________
template <typename AReal>
void TCpu<AReal>::Rearrange(std::vector<TCpuMatrix<AReal>> &out, const std::vector<TCpuMatrix<AReal>> &in)
{
// B x T x D out --- T x B x D in
size_t B = out.size();
size_t T = out[0].GetNrows();
size_t D = out[0].GetNcols();
if ((T != in.size()) || (B != in[0].GetNrows())
|| (D != in[0].GetNcols())) {
std::cout << "Incompatible Dimensions\n"
<< in.size() << "x" << in[0].GetNrows() << "x" << in[0].GetNcols()
<< " --> " << B << "x" << T << "x" << D << "\n";
return;
}
for (size_t i = 0; i < B; ++i) {
for (size_t j = 0; j < T; ++j) {
for (size_t k = 0; k < D; ++k) {
out[i](j, k) = in[j](i, k);
}
}
}
return;
}

} // namespace DNN
} // namespace TMVA
27 changes: 26 additions & 1 deletion tmva/tmva/src/DNN/Architectures/Cuda/Propagation.cu
@@ -280,7 +280,32 @@ void TCuda<AFloat>::MaxPoolLayerBackward(std::vector<TCudaMatrix<AFloat>> & acti
template<typename AFloat>
void TCuda<AFloat>::Reshape(TCudaMatrix<AFloat> &A, const TCudaMatrix<AFloat> &B)
{

//TODO
}

//______________________________________________________________________________
template <typename AReal>
void TCuda<AReal>::Rearrange(std::vector<TCudaMatrix<AReal>> &out, const std::vector<TCudaMatrix<AReal>> &in)
{
// B x T x D out --- T x B x D in
size_t B = out.size();
size_t T = out[0].GetNrows();
size_t D = out[0].GetNcols();
if ((T != in.size()) || (B != in[0].GetNrows())
|| (D != in[0].GetNcols())) {
std::cout << "Incompatible Dimensions\n"
<< in.size() << "x" << in[0].GetNrows() << "x" << in[0].GetNcols()
<< " --> " << B << "x" << T << "x" << D << "\n";
return;
}
for (size_t i = 0; i < B; ++i) {
for (size_t j = 0; j < T; ++j) {
for (size_t k = 0; k < D; ++k) {
out[i](j, k) = in[j](i, k);
}
}
}
return;
}

//____________________________________________________________________________