@jalateras jalateras commented Sep 12, 2025

Summary

  • Created a unified token_counting_utils.py module to resolve inconsistencies between duplicate implementations
  • Consolidated token counting logic from both notebooks into a single, maintainable function
  • Added support for all current OpenAI models including latest gpt-4o variants

Changes

  • New utility module: Created examples/utils/token_counting_utils.py with the consolidated implementation
  • Updated notebooks: Modified both How_to_format_inputs_to_ChatGPT_models.ipynb and How_to_count_tokens_with_tiktoken.ipynb to import from the shared utility
  • Enhanced model support: The unified function now covers the gpt-3.5-turbo, gpt-4, gpt-4o, and gpt-4o-mini families and selects the matching encoding (cl100k_base for gpt-3.5-turbo and gpt-4, o200k_base for gpt-4o variants)
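
For reviewers who want a quick picture of the consolidated logic, here is a minimal sketch of what such a unified function could look like. It is not the code in this PR. The helper name encoding_name_for_model, the optional encode parameter, and the overhead constants (3 tokens per message, 1 per name, 3 for reply priming, as in the cookbook's published approach for recent chat models) are assumptions made for illustration.

```python
from typing import Callable, Mapping, Optional, Sequence

# Assumed overheads, mirroring the cookbook's approach for recent chat models.
TOKENS_PER_MESSAGE = 3
TOKENS_PER_NAME = 1
REPLY_PRIMING_TOKENS = 3


def encoding_name_for_model(model: str) -> str:
    """Hypothetical helper: map a model name to a tiktoken encoding name."""
    # Check the gpt-4o prefix first, because "gpt-4o" also starts with "gpt-4".
    if model.startswith("gpt-4o"):
        return "o200k_base"
    if model.startswith(("gpt-3.5-turbo", "gpt-4")):
        return "cl100k_base"
    raise NotImplementedError(
        f"num_tokens_from_messages() is not implemented for model {model!r}"
    )


def num_tokens_from_messages(
    messages: Sequence[Mapping[str, str]],
    model: str = "gpt-4o-mini",
    encode: Optional[Callable[[str], Sequence]] = None,
) -> int:
    """Estimate the number of prompt tokens used by a list of chat messages."""
    encoding_name = encoding_name_for_model(model)  # rejects unsupported models
    if encode is None:
        # Imported lazily so the encoding lookup is usable without tiktoken.
        import tiktoken

        encode = tiktoken.get_encoding(encoding_name).encode

    num_tokens = 0
    for message in messages:
        num_tokens += TOKENS_PER_MESSAGE
        for key, value in message.items():
            num_tokens += len(encode(value))
            if key == "name":
                num_tokens += TOKENS_PER_NAME
    # Every reply is primed with a fixed number of assistant tokens.
    return num_tokens + REPLY_PRIMING_TOKENS
```

With tiktoken installed, calling num_tokens_from_messages(messages, model="gpt-4o") uses the o200k_base encoding, while model="gpt-4" uses cl100k_base. The encode parameter exists only so the counting logic can be exercised with a stand-in tokenizer.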

Test Plan

  • Verified function works with all supported models (gpt-3.5-turbo, gpt-4, gpt-4o, gpt-4o-mini variants)
  • Tested that token counting produces the expected results across different model families
  • Confirmed both notebooks can successfully import and use the shared utility function

Fixes #2134

Resolves inconsistency between two different implementations of
num_tokens_from_messages in the cookbook notebooks by creating a
unified utility function that supports all current OpenAI models.

- Created shared token_counting_utils.py module in examples/utils/
- Consolidated logic from both notebook versions into single function
- Added support for all current models including gpt-4o variants
- Updated both notebooks to import from shared utility module
- Maintains backward compatibility with existing code

This ensures consistent token counting across all cookbook examples
and makes it easier to maintain model support in one location.

@jalateras force-pushed the fix/consolidate-token-counting-2134 branch from 25738b1 to 8b1e956 on September 12, 2025 06:53
Successfully merging this pull request may close these issues.

[PROBLEM] num_tokens_from_messages versions inconsistent