A different implementation for car rental problem #166

kevinnewgame · 2024-09-13T05:22:23Z

Storing the probability model in NumPy arrays makes the DP algorithm efficient without any manual parallelization.
The running time is short: it takes about 2 seconds to generate the plots in the book.
The idea is to separate the reward and value terms in the Bellman equation and compute their expectations separately. Because the model is known in a DP problem, the expected rewards and the state-transition probabilities can be computed and stored before policy updating begins.
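A minimal sketch of this idea follows. It is not the code from this PR. It assumes the standard setup of the book's example (at most 20 cars per location, up to 5 cars moved overnight, rental income 10, move cost 2, discount 0.9, Poisson requests 3 and 4, returns 3 and 2). The names `location_model`, `q_values` and `policy_iteration` are hypothetical. It also assumes that cars moved beyond a location's capacity are simply lost. Each location gets a precomputed transition matrix and expected-reward vector, so the expected next value over all states reduces to one matrix product, `P1 @ V @ P2.T`.

```python
import numpy as np

# Assumed problem constants (standard car rental setup, not taken from the PR).
MAX_CARS, MAX_MOVE = 20, 5
RENT_REWARD, MOVE_COST, GAMMA = 10.0, 2.0, 0.9


def poisson_pmf(k_max, lam):
    """Poisson probabilities for k = 0..k_max, using NumPy only."""
    k = np.arange(k_max + 1)
    log_fact = np.concatenate(([0.0], np.cumsum(np.log(np.arange(1, k_max + 1)))))
    return np.exp(-lam + k * np.log(lam) - log_fact)


def location_model(lam_request, lam_return):
    """Precompute one location's model.

    P[n, m] = Pr(day ends with m cars | day starts with n cars)
    r[n]    = expected rental income when the day starts with n cars
    """
    size = MAX_CARS + 1
    P, r = np.zeros((size, size)), np.zeros(size)
    for n in range(size):
        rent_p = poisson_pmf(n, lam_request)
        rent_p[n] = 1.0 - rent_p[:n].sum()      # requests >= n all rent n cars
        r[n] = RENT_REWARD * rent_p @ np.arange(n + 1)
        for rented, p_rent in enumerate(rent_p):
            left = n - rented
            room = MAX_CARS - left
            ret_p = poisson_pmf(room, lam_return)
            ret_p[room] = 1.0 - ret_p[:room].sum()  # extra returns are dropped
            P[n, left:] += p_rent * ret_p
    return P, r


def q_values(V, P1, r1, P2, r2):
    """Action values for every state, using only array operations."""
    # G[n1, n2]: expected reward plus discounted value from the
    # after-move state (n1, n2). Reward and value terms are separate.
    G = r1[:, None] + r2[None, :] + GAMMA * (P1 @ V @ P2.T)
    size = MAX_CARS + 1
    i, j = np.meshgrid(np.arange(size), np.arange(size), indexing="ij")
    actions = np.arange(-MAX_MOVE, MAX_MOVE + 1)  # a > 0 moves cars 1 -> 2
    Q = np.full((len(actions), size, size), -np.inf)
    for k, a in enumerate(actions):
        ok = (i - a >= 0) & (j + a >= 0)          # cannot move cars we lack
        ni = np.minimum(i - a, MAX_CARS)          # overflow cars are lost
        nj = np.minimum(j + a, MAX_CARS)
        Q[k][ok] = G[ni[ok], nj[ok]] - MOVE_COST * abs(a)
    return actions, Q


def policy_iteration(tol=1e-4, max_iters=50):
    P1, r1 = location_model(3, 3)   # precomputed once, before any policy update
    P2, r2 = location_model(4, 2)
    size = MAX_CARS + 1
    V = np.zeros((size, size))
    policy = np.zeros((size, size), dtype=int)
    for _ in range(max_iters):
        while True:                               # policy evaluation
            _, Q = q_values(V, P1, r1, P2, r2)
            idx = (policy + MAX_MOVE)[None]
            V_new = np.take_along_axis(Q, idx, axis=0)[0]
            delta = np.abs(V_new - V).max()
            V = V_new
            if delta < tol:
                break
        actions, Q = q_values(V, P1, r1, P2, r2)  # policy improvement
        new_policy = actions[Q.argmax(axis=0)]
        if np.array_equal(new_policy, policy):
            break
        policy = new_policy
    return policy, V


if __name__ == "__main__":
    policy, V = policy_iteration()
    print(policy)
```

Because `P1`, `r1`, `P2` and `r2` never change, each evaluation sweep costs two small matrix products plus some indexing, instead of a nested loop over every request and return count.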

A different implementation for car rental problem

c0b1628
