This repository was archived by the owner on Mar 21, 2024. It is now read-only.

Migrate kernel specialization infra to ptx_dispatch #1735

Closed

alliepiper wants to merge 7 commits into NVIDIA:main from alliepiper:if_target

Conversation

@alliepiper (Collaborator)

No description provided.

@alliepiper alliepiper added this to the 2.0.0 milestone Jul 1, 2022
@alliepiper alliepiper marked this pull request as draft July 1, 2022 03:07
@alliepiper (Collaborator Author)

run tests

@miscco (Collaborator) left a comment

Some cursory glances

@alliepiper (Collaborator Author)

run tests

@alliepiper (Collaborator Author)

run tests

3 similar comments
@alliepiper (Collaborator Author)

run tests

@alliepiper (Collaborator Author)

run tests

@alliepiper (Collaborator Author)

run tests

@alliepiper alliepiper modified the milestones: 2.0.0, 2.1.0 Aug 4, 2022
@alliepiper alliepiper force-pushed the if_target branch 3 times, most recently from e04685b to a7d19dc Compare September 13, 2022 12:41
@alliepiper (Collaborator Author)

run tests

@alliepiper (Collaborator Author)

run tests

@alliepiper (Collaborator Author)

run tests

@alliepiper (Collaborator Author)

run tests

@alliepiper alliepiper marked this pull request as ready for review September 14, 2022 20:24
@alliepiper (Collaborator Author)

run tests

@alliepiper alliepiper added the type: enhancement and P0: must have labels Sep 14, 2022
@alliepiper alliepiper added the release: breaking change label Sep 14, 2022
@alliepiper (Collaborator Author)

run tests

@alliepiper alliepiper force-pushed the if_target branch 2 times, most recently from 0002fc2 to c86a3b7 Compare September 23, 2022 19:47
@alliepiper (Collaborator Author)

Resolved conflicts, restarting CI for new CUB tests.

run tests

@alliepiper alliepiper force-pushed the if_target branch 2 times, most recently from 05c3527 to a78f492 Compare September 23, 2022 20:31
@alliepiper (Collaborator Author)

run tests

@alliepiper (Collaborator Author)

run tests

@alliepiper (Collaborator Author)

run tests

@alliepiper (Collaborator Author)

Builds were interrupted by a Jenkins upgrade.

run tests

@alliepiper (Collaborator Author)

run tests

@alliepiper (Collaborator Author)

run tests


if (d_temp_storage == NULL)
{
if (d_temp_storage == nullptr)
Collaborator

Suggested change:
- if (d_temp_storage == nullptr)
+ if (!d_temp_storage)


reduce_agent ra(reduce_plan, num_items, stream, vshmem_ptr, "reduce_agent: single_tile only", debug_sync);
ra.launch(input_it, output_it, num_items, reduction_op);
char *vshmem_ptr = vshmem_size > 0
Collaborator

Above you have char *const. We should consolidate all uses.


size_t vshmem_storage = core::vshmem_size(partition_plan.shared_memory_size,
num_tiles);
if (!d_temp_storage)
Collaborator

No change requested:

There is a whole lot of duplication with respect to the partitioning of the work. I am wondering whether we can extract that common functionality into a common helper function.

Not in this PR though

for (size_t i = 0; i < n; i++)
{
// XXX Use proper random number generation facility.
h_keys[i] = FixedVector<T, N>(rand());
Collaborator

Do we want to take the opportunity to use something from <random>?

static const cub::CacheLoadModifier LOAD_MODIFIER = _LOAD_MODIFIER;
static const cub::BlockScanAlgorithm SCAN_ALGORITHM = _SCAN_ALGORITHM;
}; // struct PtxPolicy
static constexpr int BLOCK_THREADS = _BLOCK_THREADS;
Collaborator

I would rather rename the template arguments than store additional static variables.

CUB_MIN(NOMINAL_4B_ITEMS_PER_THREAD,
CUB_MAX(1, (NOMINAL_4B_ITEMS_PER_THREAD * 4 / sizeof(T))));

using Policy = PtxPolicy<128,
Collaborator

Can we give the magic 128 a name?

@jrhemstad (Collaborator)

@allisonvacanti and I talked and we agreed to close this PR for now as this work has been de-prioritized. We will pick it back up if/when it becomes a priority again.

@jrhemstad jrhemstad closed this May 2, 2023

Labels

P0: must have (Absolutely necessary. Critical issue, major blocker, etc.)
release: breaking change (Include in "Breaking Changes" section of release notes.)
testing: gpuCI in progress (Started gpuCI testing.)
type: enhancement (New feature or request.)

Projects

Archived in project


3 participants