Commit cdab611

Author: Deniz Saner
Added vocab and vocab_size to CBOW exercise
Introduced a variable `vocab`, set to `set(raw_text)`, and a variable `vocab_size`, which holds the length of `vocab`.
1 parent 7b148a6 commit cdab611

1 file changed: +6 -1 lines changed

beginner_source/nlp/word_embeddings_tutorial.py

Lines changed: 6 additions & 1 deletion
@@ -309,7 +309,12 @@ def forward(self, inputs):
 The evolution of a process is directed by a pattern of rules
 called a program. People create programs to direct processes. In effect,
 we conjure the spirits of the computer with our spells.""".split()
-word_to_ix = {word: i for i, word in enumerate(raw_text)}
+
+# By deriving a set from `raw_text`, we deduplicate the array
+vocab = set(raw_text)
+vocab_size = len(vocab)
+
+word_to_ix = {word: i for i, word in enumerate(vocab)}
 data = []
 for i in range(2, len(raw_text) - 2):
     context = [raw_text[i - 2], raw_text[i - 1],
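
The effect of this change can be illustrated with a small standalone sketch. It is not part of the commit: `sample_text` is a hypothetical stand-in for the tutorial's much longer `raw_text`, and the `sorted_word_to_ix` variant is an assumption added here for illustration only. Enumerating the raw token list assigns each word the position of its last occurrence, so indices can be as large as `len(raw_text) - 1`; enumerating the deduplicated set yields contiguous indices in `range(vocab_size)`.

```python
# Hypothetical sample text; the tutorial's real `raw_text` is much longer.
sample_text = "we conjure the spirits of the computer with our spells we conjure".split()

# Before the commit: indices come from positions in the token list, so a
# repeated word ends up with the index of its last occurrence.
old_word_to_ix = {word: i for i, word in enumerate(sample_text)}

# After the commit: indices come from the deduplicated vocabulary.
vocab = set(sample_text)
vocab_size = len(vocab)
new_word_to_ix = {word: i for i, word in enumerate(vocab)}

# Optional variant (an assumption, not in the commit): sorting the set first
# makes the word-to-index mapping reproducible across interpreter runs.
sorted_word_to_ix = {word: i for i, word in enumerate(sorted(vocab))}

print(len(sample_text), vocab_size)          # token count vs. unique words
print(max(old_word_to_ix.values()))          # can exceed vocab_size - 1
print(max(new_word_to_ix.values()))          # always vocab_size - 1
```

With the old mapping, any index greater than or equal to `vocab_size` would fall outside a lookup table that has one row per unique word, which is presumably the situation the commit avoids. Note that the iteration order of a Python `set` of strings is not guaranteed to be the same between runs, so the specific index given to each word may vary unless the vocabulary is sorted as in the optional variant.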
