Skip to content

Conversation

AndreaCatania
Copy link
Contributor

Added set and get to the Tensor, closes #222.

You should consider to use usize instead of u64 for the Tensor dims, so we don't need to convert u64 to usize.

@googlebot
Copy link

Thanks for your pull request. It looks like this may be your first contribution to a Google open source project (if not, look below for help). Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA).

📝 Please visit https://cla.developers.google.com/ to sign.

Once you've signed (or fixed any issues), please reply here with @googlebot I signed it! and we'll verify it.


What to do if you already signed the CLA

Individual signers
Corporate signers

ℹ️ Googlers: Go here for more info.

@AndreaCatania
Copy link
Contributor Author

@googlebot I signed it!

@googlebot
Copy link

CLAs look good, thanks!

ℹ️ Googlers: Go here for more info.

Copy link
Contributor

@adamcrume adamcrume left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This could use some tests. For example, creating a [2, 3, 5]-shape tensor, setting element [1, 1, 1] with set, and verifying that the element at index 1+15+15*3 =21 is set.

src/lib.rs Outdated
/// assert_eq!(a.get_index(&[1, 0]), 1);
/// ```
pub fn get_index(&self, indices: &[u64]) -> usize {
assert!(self.dims.len() >= indices.len());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure there's any real benefit to allowing users to specify only part of the index, and it's likely to cause confusion. I would just require self.dims.len() == indices.len().

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep I agree, done

let mut d = 1;
for i in 0..indices.len() {
assert!(self.dims[i] > indices[i]);
index += indices[i] * d;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The docs on Tensor say that element 1 corresponds to index [0, ..., 1], but this code is mapping element 1 to index [1, ..., 0].

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, I always used it inversely; Fixed.

@adamcrume
Copy link
Contributor

I'm using u64 for the dimensions because the underlying TensorFlow C API uses int64_t for dimensions. I use unsigned instead of signed whenever we know that -1 isn't supported, but at least we don't have to worry about int64_t versus size_t issues with the C API. It does make the Rust API a little funny, but I'm trying to avoid edge cases where the C API works but the Rust API doesn't.

@AndreaCatania
Copy link
Contributor Author

AndreaCatania commented Oct 15, 2019

This could use some tests. For example, creating a [2, 3, 5]-shape tensor, setting element [1, 1, 1] with set, and verifying that the element at index 1+1_5+1_5*3 =21 is set.

I've modified the previous test with the above one, you can find it in the get function.


I'm using u64 for the dimensions because the underlying TensorFlow C API uses int64_t for dimensions.

The things is that you can't access any array (an so a tensor) using a type different than a usize, this mean that at some point the user have to convert a u64 to usize.

With the actual implementation you can create Tensors of u64 size even in a 16 bit or 32 bit architectures (where the usize is 16 or 32 bits) this mean that part of the Tensor is inaccessible.

On the other hand, converting a usize to u64 it is always possible and cause less problem (since you can't anymore create Tensor of unsupported sizes). So I think that the conversion should be handled safely internally by this crate.

@adamcrume
Copy link
Contributor

Hm, you're right that users can't actually fit a full, materialized 64-bit-sized tensor in memory on a 32-bit architecture. I was concerned about the indexes used here being the same type as what's returned by Tensor::dims, but users are already using usize to index into tensors right now, which is the more natural type. Feel free to change it to usize.

@AndreaCatania
Copy link
Contributor Author

AndreaCatania commented Oct 16, 2019

I think that for now it's better to use u64 for setters and getters too, and avoid confusion.
But in my opinion in future the Tensor should use usize even for dimensions and perform the conversion to u64 internally.

@adamcrume
Copy link
Contributor

We can't change the Tensor dimensions from u64 to usize without making a breaking change, unless we simply add separate methods. It would also mean a mismatch between Tensor dimension types and dimension types for tensors in the graph which are not fully materialized on the current machine. We also have to consider sparse tensors, which could easily have a u64 address space, even if only a couple of items are filled out.

This is something we'll need to revisit before moving to version 1.0.

@adamcrume
Copy link
Contributor

Thanks for the contribution!

@adamcrume adamcrume merged commit dc20fa5 into tensorflow:master Oct 17, 2019
@AndreaCatania AndreaCatania deleted the setget branch October 18, 2019 08:48
@AndreaCatania
Copy link
Contributor Author

Welcome and Thank you!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Set data to a Tensor in Row/Column style
3 participants