Issue description
I tried to execute a slightly modified version of this script (no significant changes were made) on an embedding with a large vocabulary and 600 dimensions:
import numpy as np
from nncompress import EmbeddingCompressor
# Load my embedding matrix
matrix = np.load("data/glove.6B.300d.npy")
# Initialize the compressor
compressor = EmbeddingCompressor(32, 16, "data/mymodel")
# Train the quantization model
compressor.train(matrix)
# Evaluate
distance = compressor.evaluate(matrix)
print("Mean euclidean distance:", distance)
# Export the codes and codebook
compressor.export(matrix, "data/mymodel")
But then, this is what I got:
Traceback (most recent call last):
File "compress.py", line 82, in <module>
pipe\
File "compress.py", line 70, in train
compressor.train(matrix)
File "/home/user/summer/smallnilc/nncompress/embed_compress.py", line 159, in train
word_ids_var, loss_op, train_op, maxp_op = self.build_training_graph(embed_matrix)
File "/home/user/summer/smallnilc/nncompress/embed_compress.py", line 114, in build_training_graph
input_matrix = tf.constant(embed_matrix, name="embed_matrix")
File "/home/user/summer/smallnilc/small/lib/python3.6/site-packages/tensorflow/python/framework/constant_op.py", line 180, in constant_v1
allow_broadcast=False)
File "/home/user/summer/smallnilc/small/lib/python3.6/site-packages/tensorflow/python/framework/constant_op.py", line 284, in _constant_impl
allow_broadcast=allow_broadcast))
File "/home/user/summer/smallnilc/small/lib/python3.6/site-packages/tensorflow/python/framework/tensor_util.py", line 537, in make_tensor_proto
"Cannot create a tensor proto whose content is larger than 2GB.")
ValueError: Cannot create a tensor proto whose content is larger than 2GB.
TensorFlow developers have answered similar issues by saying that the only solution is to rewrite the code so that it never hits the hard 2 GB limit imposed by protobuf.
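For reference, one common way to stay under the limit (a sketch of the usual workaround, not this project's own fix) is to feed the matrix in through a placeholder instead of baking it into the graph with `tf.constant`, so the matrix never enters the serialized GraphDef. The shapes below are illustrative stand-ins; `tf.compat.v1` is used so the snippet runs under both TF 1.x and 2.x:

```python
import numpy as np
import tensorflow as tf

tf.compat.v1.disable_eager_execution()

# Stand-in for the real embedding matrix (the actual one is np.load-ed).
embed_matrix = np.random.rand(1000, 600).astype(np.float32)

# A placeholder keeps the matrix out of the serialized graph, so the
# 2 GB protobuf limit no longer applies to it.
input_matrix = tf.compat.v1.placeholder(
    tf.float32, shape=embed_matrix.shape, name="embed_matrix")
word_ids = tf.compat.v1.placeholder(tf.int32, shape=[None], name="word_ids")
vectors = tf.nn.embedding_lookup(input_matrix, word_ids)

with tf.compat.v1.Session() as sess:
    # The matrix is supplied at run time via feed_dict instead of
    # being embedded in the graph definition.
    out = sess.run(vectors, feed_dict={input_matrix: embed_matrix,
                                       word_ids: [0, 1, 2]})
print(out.shape)  # (3, 600)
```

Applying this to `embed_compress.py` would mean replacing the `tf.constant(embed_matrix, ...)` call on line 114 with a placeholder and threading the matrix through every `session.run` call's `feed_dict`.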
Steps to reproduce the issue
Simply try to compress an embedding above 300 dimensions (either 600 or 1000 dimensions).
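As a rough sanity check (the vocabulary size below is hypothetical; the exact failure point depends on the matrix shape and dtype), the error can be predicted from the raw tensor size, since `make_tensor_proto` serializes the entire matrix into one protobuf message:

```python
# protobuf imposes a hard 2 GiB ceiling on a single serialized message.
LIMIT = 2 * 1024 ** 3

def tensor_bytes(vocab_size, dims, itemsize=4):
    """Raw size in bytes of an embedding matrix of float32 values."""
    return vocab_size * dims * itemsize

# Hypothetical vocabulary of one million words, stored as float32:
print(tensor_bytes(1_000_000, 300) / 1e9)   # 1.2 GB -> fits under the limit
print(tensor_bytes(1_000_000, 600) / 1e9)   # 2.4 GB -> exceeds the limit
print(tensor_bytes(1_000_000, 1000) / 1e9)  # 4.0 GB -> exceeds the limit
```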