Count OpenAI tokens from the command line. Useful for estimating API costs and staying within context limits.
pip install tiktoken
curl -o tiktoken_cli.py https://raw.githubusercontent.com/himalaya0x/tiktoken-cli/master/tiktoken_cli.py

# Count tokens in a string
python3 tiktoken_cli.py "Hello, world!"
# Count tokens in a file
python3 tiktoken_cli.py -f prompt.txt
# Specify model encoding
python3 tiktoken_cli.py -m gpt-4 "What is the meaning of life?"
# Read from stdin
cat document.txt | python3 tiktoken_cli.py
# Show token IDs
python3 tiktoken_cli.py --ids "Hello, world!"
# Compare across encodings
python3 tiktoken_cli.py --compare "Hello, world!"

$ python3 tiktoken_cli.py -m gpt-4 "What is the meaning of life?"
Model: gpt-4
Encoding: cl100k_base
Tokens: 7
Chars: 29
Ratio: 4.1 chars/token
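For readers curious how a report like the one above could be produced, here is a minimal sketch. It is not the source of tiktoken_cli.py: the function names `count_tokens`, `chars_per_token`, and `format_report` are hypothetical, and the only tiktoken calls assumed are `tiktoken.encoding_for_model` and `Encoding.encode`.

```python
def count_tokens(text: str, model: str = "gpt-4") -> tuple[str, int]:
    """Return (encoding name, token count) for text under the model's encoding."""
    # Imported lazily so the pure helpers below work without tiktoken installed.
    import tiktoken

    enc = tiktoken.encoding_for_model(model)  # raises KeyError for unknown models
    return enc.name, len(enc.encode(text))


def chars_per_token(n_chars: int, n_tokens: int) -> float:
    """Average characters per token, rounded to one decimal; 0.0 if no tokens."""
    if n_tokens == 0:
        return 0.0
    return round(n_chars / n_tokens, 1)


def format_report(model: str, encoding: str, n_tokens: int, n_chars: int) -> str:
    """Build a five-line report in the same layout as the sample output."""
    return "\n".join([
        f"Model: {model}",
        f"Encoding: {encoding}",
        f"Tokens: {n_tokens}",
        f"Chars: {n_chars}",
        f"Ratio: {chars_per_token(n_chars, n_tokens)} chars/token",
    ])
```

With tiktoken installed, calling `count_tokens(text, "gpt-4")` and passing the result plus `len(text)` to `format_report` would print a comparable report. The first call may need network access, since tiktoken downloads the encoding data on first use.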
All OpenAI model encodings via tiktoken:
gpt-4, gpt-4-turbo → cl100k_base (default)
gpt-4o → o200k_base
gpt-3.5-turbo → cl100k_base
text-davinci-003 → p50k_base
text-embedding-ada-002 → cl100k_base
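A model-to-encoding lookup like the one listed above could be sketched as follows. This is an illustrative assumption, not the tool's actual code: `MODEL_TO_ENCODING`, `DEFAULT_ENCODING`, and `resolve_encoding` are hypothetical names, and the table holds only a few of the listed models.

```python
from typing import Optional

# Encoding used when no model is given (assumed from the "(default)" note).
DEFAULT_ENCODING = "cl100k_base"

# Hypothetical subset of the model-to-encoding table.
MODEL_TO_ENCODING = {
    "gpt-4": "cl100k_base",
    "gpt-3.5-turbo": "cl100k_base",
    "text-davinci-003": "p50k_base",
    "text-embedding-ada-002": "cl100k_base",
}


def resolve_encoding(model: Optional[str] = None) -> str:
    """Return the encoding name for a model, or the default if none is given."""
    if model is None:
        return DEFAULT_ENCODING
    try:
        return MODEL_TO_ENCODING[model]
    except KeyError:
        known = ", ".join(sorted(MODEL_TO_ENCODING))
        # Fail loudly instead of guessing: a wrong encoding gives wrong counts.
        raise ValueError(f"unknown model {model!r}; known models: {known}") from None
```

Raising on an unknown model is a deliberate choice in this sketch: silently falling back to the default would produce token counts that look plausible but may be wrong for models using a different encoding.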
MIT