
Conversation

Contributor

@KrishnanPrash KrishnanPrash commented Nov 12, 2025

Overview:

With #3988, we have functional image decoding in the frontend for any b64 or HTTP URL passed with an inference request. This PR builds on #3988 and implements the NIXL read() portion of the image-decoding workflow for the backend.
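For context, "decoding a b64 URL" on the frontend amounts to parsing a standard data URL and base64-decoding its payload. A minimal sketch (the function name `decode_b64_image_url` is hypothetical, not from this PR; this is not the actual frontend code, which is in Rust):

```python
import base64

def decode_b64_image_url(url: str) -> bytes:
    """Illustrative only: extract raw image bytes from a base64 data URL."""
    # A data URL looks like: data:image/png;base64,<payload>
    header, _, payload = url.partition(",")
    if not header.startswith("data:") or not header.endswith(";base64"):
        raise ValueError(f"not a base64 data URL: {header!r}")
    return base64.b64decode(payload)

# 1x1 transparent PNG, base64-encoded
png_b64 = (
    "iVBORw0KGgoAAAANSUhEUgAAAAEAAAABCAYAAAAfFcSJ"
    "AAAADUlEQVR42mNkYPhfDwAChwGA60e6kgAAAABJRU5ErkJggg=="
)
data = decode_b64_image_url("data:image/png;base64," + png_b64)
assert data[:8] == b"\x89PNG\r\n\x1a\n"  # PNG magic bytes
```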

Details:

See handlers.py for the additions to the DECODED workflow.

milesial and others added 14 commits November 10, 2025 14:18
Signed-off-by: Alexandre Milesi <[email protected]>
Signed-off-by: Krishnan Prashanth <[email protected]>
@KrishnanPrash KrishnanPrash requested review from a team as code owners November 12, 2025 21:06
@KrishnanPrash KrishnanPrash marked this pull request as draft November 12, 2025 21:06
@github-actions github-actions bot added the feat label Nov 12, 2025
@KrishnanPrash KrishnanPrash reopened this Nov 13, 2025
Signed-off-by: Krishnan Prashanth <[email protected]>
Comment on lines +127 to +166
async def _read_decoded_image_via_nixl(
    self, decoded_meta: Dict[str, Any]
) -> PIL.Image.Image:
    """Read decoded image via NIXL RDMA and convert to PIL.Image."""
    # Lazy-init connector
    if self._connector is None:
        self._connector = connect.Connector()
        await self._connector.initialize()
        logger.info("NIXL connector initialized for decoded media")

    # Extract fields
    meta_str = decoded_meta["nixl_metadata"]
    desc = decoded_meta["nixl_descriptor"]
    shape = decoded_meta["shape"]

    # Create tensor to receive RDMA data
    tensor = torch.empty(shape, dtype=torch.uint8)

    # Build RdmaMetadata from frontend-provided descriptor
    # Frontend sends compressed metadata (matches Python nixl_connect)
    rdma_meta = RdmaMetadata(
        descriptors=[
            SerializedDescriptor(
                device="cpu"
                if desc.get("mem_type") == "Dram"
                else f"cuda:{desc.get('device_id', 0)}",
                ptr=desc["addr"],
                size=desc["size"],
            )
        ],
        nixl_metadata=meta_str,
        notification_key=f"img-{shape}",
        operation_kind=int(OperationKind.READ),
    )

    # RDMA read
    read_op = await self._connector.begin_read(
        rdma_meta, connect.Descriptor(tensor)
    )
    await read_op.wait_for_completion()
Contributor Author
Not a NIXL expert, so please let me know if I can be doing anything here better.

Comment on lines 125 to 129
// Compress metadata before base64 encoding (matches Python nixl_connect behavior)
// Backend expects: b64:<base64_of_compressed_bytes>
let mut encoder = ZlibEncoder::new(Vec::new(), Compression::new(6));
encoder.write_all(&nixl_md)?;
let compressed = encoder.finish()?;
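The backend's side of this framing is the inverse: base64-decode, then zlib-decompress. A minimal round-trip sketch in Python, assuming the `b64:<base64_of_compressed_bytes>` convention described in the comment above (function names are hypothetical, for illustration only):

```python
import base64
import zlib

def compress_for_frontend(nixl_md: bytes) -> str:
    """Mirror of the Rust side: zlib-compress, then base64-encode with a b64: prefix."""
    compressed = zlib.compress(nixl_md, level=6)
    return "b64:" + base64.b64encode(compressed).decode("ascii")

def decompress_on_backend(meta_str: str) -> bytes:
    """What the backend does to recover the raw NIXL metadata."""
    assert meta_str.startswith("b64:"), "unexpected metadata framing"
    return zlib.decompress(base64.b64decode(meta_str[len("b64:"):]))

# Round trip
raw = b"example nixl agent metadata"
assert decompress_on_backend(compress_for_frontend(raw)) == raw
```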
Contributor Author

@KrishnanPrash KrishnanPrash Nov 13, 2025

Once again, welcome any suggestions on correct nixl usage.

@KrishnanPrash
Contributor Author

KrishnanPrash commented Nov 13, 2025

Open Question for Testing:

Ideally, we would like to cover both test cases:

  1. Frontend URL pass through + backend decoding: This requires building without nixl.
  2. Frontend decoding + backend nixl read: This requires building dynamo with the command maturin develop --features media-nixl
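The two paths differ in what the backend receives per media item. A hypothetical sketch of the dispatch, with field names (`nixl_metadata`, `nixl_descriptor`, `shape`, `url`) assumed from the diff above rather than taken from the actual handler:

```python
from typing import Any, Dict

def media_path(item: Dict[str, Any]) -> str:
    """Hypothetical dispatch: decide which of the two test cases an input exercises."""
    if "nixl_metadata" in item and "nixl_descriptor" in item:
        # Frontend already decoded the image; backend performs a NIXL read.
        return "nixl-read"
    if "url" in item:
        # Frontend passed the URL through; backend decodes it itself.
        return "backend-decode"
    raise ValueError("unrecognized media payload")

assert media_path({"url": "https://example.com/cat.png"}) == "backend-decode"
assert media_path(
    {"nixl_metadata": "b64:...", "nixl_descriptor": {}, "shape": [1, 1, 3]}
) == "nixl-read"
```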

Based on my conversation with @nv-tusharma, IIUC they suggested creating a separate, non-blocking workflow outside .github/workflows/container-backends-validation.yaml that would still run in our current CI.

@KrishnanPrash KrishnanPrash marked this pull request as ready for review November 13, 2025 23:54
@KrishnanPrash KrishnanPrash changed the title feat: Adding nixl read support for decoded path feat: Adding nixl read() multimodal support for vLLM backend Nov 13, 2025
@rmccorm4 rmccorm4 requested a review from ayushag-nv November 14, 2025 18:57

# Build RdmaMetadata from frontend-provided descriptor
# Frontend sends compressed metadata (matches Python nixl_connect)
rdma_meta = RdmaMetadata(
Collaborator

Does this work? Have you tested it?

The "normal flow" is to create a passive operation (ReadableOp or WritableOp) and use its .metadata property to get the set of SerializedDescriptors, rather than composing them manually.

Given that this code uses an active operation (ReadOp), it should be taking in the metadata to perform the read, not sending the metadata.

Collaborator

Usually the metadata comes from the secondary connection, which in turn got it from its ReadableOperation.


# Extract fields
meta_str = decoded_meta["nixl_metadata"]
desc = decoded_meta["nixl_descriptor"]
Collaborator

what type is desc?

# Frontend sends compressed metadata (matches Python nixl_connect)
rdma_meta = RdmaMetadata(
descriptors=[
SerializedDescriptor(
Collaborator

Why not pass desc to a Descriptor and then serialize the descriptor to get the metadata?

Base automatically changed from alexandrem/frontend-image-decoding-nixl to main November 25, 2025 19:15


7 participants