# How to convert words into vectors with GloVe

`GloVe` is an unsupervised learning algorithm for obtaining vector representations of words. In this tutorial we will see how to look up the pre-trained vector for a given word.

###### Prerequisites
• Run the code snippets below in `Google Colab` or a `Jupyter Notebook`, because some steps execute `Unix` commands (the lines prefixed with `!`).

Download `GloVe` pre-trained vectors

``````
!wget http://nlp.stanford.edu/data/glove.6B.zip
``````

``````
!unzip glove.6B.zip
``````

You should see output similar to the following:

``````
Archive:  glove.6B.zip
inflating: glove.6B.50d.txt
inflating: glove.6B.100d.txt
inflating: glove.6B.200d.txt
inflating: glove.6B.300d.txt
``````

View files after unzipping

``````
!ls -lrt
``````

You should see output similar to the following:

``````
total 3039136
-rw-rw-r-- 1 root root  347116733 Aug  4  2014 glove.6B.100d.txt
-rw-rw-r-- 1 root root  693432828 Aug  4  2014 glove.6B.200d.txt
-rw-rw-r-- 1 root root  171350079 Aug  4  2014 glove.6B.50d.txt
-rw-rw-r-- 1 root root 1037962819 Aug 27  2014 glove.6B.300d.txt
-rw-r--r-- 1 root root  862182613 Oct 25  2015 glove.6B.zip
``````

The output shows four text files: `glove.6B.50d.txt`, `glove.6B.100d.txt`, `glove.6B.200d.txt` and `glove.6B.300d.txt`. They contain pre-trained vectors of length 50, 100, 200 and 300 respectively. This tutorial uses vectors of length 50, so we will work with `glove.6B.50d.txt`.
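Since only one of the four files is needed, it may be worth extracting just that file when disk space is tight. The sketch below uses Python's standard `zipfile` module; the helper name `extract_member` is an illustrative choice and not part of the original steps.

```python
import zipfile
from pathlib import Path


def extract_member(zip_path, member, dest_dir="."):
    """Extract a single file from a zip archive and return the path it was written to."""
    with zipfile.ZipFile(zip_path) as archive:
        # extract() writes only the named member and returns the created path
        return Path(archive.extract(member, path=dest_dir))
```

With the archive downloaded above, a call such as `extract_member("glove.6B.zip", "glove.6B.50d.txt")` should leave only the 50-dimensional file next to the zip.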

Create a dictionary that maps each word in the file `glove.6B.50d.txt` to its vector

``````
import numpy as np

# Create an empty dictionary
word2vector = {}

# Map each word to its vector; every line holds a word followed by its 50 values
with open('glove.6B.50d.txt', encoding='utf-8') as file:
    for line in file:
        list_of_values = line.split()
        word = list_of_values[0]
        vector_of_word = np.asarray(list_of_values[1:], dtype='float32')
        word2vector[word] = vector_of_word

msg = f"Total number of words and corresponding vectors in word2vectors are {len(word2vector)}"
print(msg)

# View the first record in the word2vector dictionary
for word, vector in word2vector.items():
    print(word)
    print(vector)
    print(vector.shape)
    break
``````

You should see output similar to the following:

``````
Total number of words and corresponding vectors in word2vectors are 400000
the
[ 4.1800e-01  2.4968e-01 -4.1242e-01  1.2170e-01  3.4527e-01 -4.4457e-02
-4.9688e-01 -1.7862e-01 -6.6023e-04 -6.5660e-01  2.7843e-01 -1.4767e-01
-5.5677e-01  1.4658e-01 -9.5095e-03  1.1658e-02  1.0204e-01 -1.2792e-01
-8.4430e-01 -1.2181e-01 -1.6801e-02 -3.3279e-01 -1.5520e-01 -2.3131e-01
-1.9181e-01 -1.8823e+00 -7.6746e-01  9.9051e-02 -4.2125e-01 -1.9526e-01
4.0071e+00 -1.8594e-01 -5.2287e-01 -3.1681e-01  5.9213e-04  7.4449e-03
1.7778e-01 -1.5897e-01  1.2041e-02 -5.4223e-02 -2.9871e-01 -1.5749e-01
-3.4758e-01 -4.5637e-02 -4.4251e-01  1.8785e-01  2.7849e-03 -1.8411e-01
-1.1514e-01 -7.8581e-01]
(50,)
``````
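The loading step can also be wrapped in a reusable function that checks every line has the expected number of values. This is a minimal sketch rather than a definitive loader: the name `load_glove`, the `expected_dim` parameter and the 3-dimensional sample text are assumptions made for illustration.

```python
import io

import numpy as np


def load_glove(lines, expected_dim=None):
    """Build a {word: vector} dict from an iterable of GloVe-style text lines."""
    word2vector = {}
    for line_number, line in enumerate(lines, start=1):
        parts = line.rstrip().split(" ")
        if len(parts) < 2:
            continue  # skip blank lines or lines without any values
        word, values = parts[0], parts[1:]
        if expected_dim is not None and len(values) != expected_dim:
            raise ValueError(
                f"line {line_number}: expected {expected_dim} values, got {len(values)}"
            )
        word2vector[word] = np.asarray(values, dtype="float32")
    return word2vector


# Made-up 3-dimensional sample laid out like the GloVe text files
sample = io.StringIO("the 0.1 0.2 0.3\nword -0.4 0.5 0.6\n")
toy_vectors = load_glove(sample, expected_dim=3)
print(len(toy_vectors), toy_vectors["word"].shape)
```

Because the function accepts any iterable of lines, the real file can be passed in the same way, for example `load_glove(open('glove.6B.50d.txt', encoding='utf-8'), expected_dim=50)`.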

Look up the word vector for each word in a sentence

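``````
# Sample sentence with four words
sample_sentence = "convert word into vectors"

# Each word must be lowercase and present in the vocabulary,
# otherwise the lookup raises a KeyError
for word in sample_sentence.split():
    print(word, word2vector[word], word2vector[word].shape)
``````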

You should see output similar to the following:

``````
convert [ 0.33661   -0.51168    0.87064   -0.95326    0.74852    0.12839
-0.37988    0.10754   -0.36786    1.4141     0.62383    0.45762
0.62611   -0.11105   -0.41305    0.67618    0.43104   -0.57291
0.016154  -0.0049896  0.40332   -0.59646   -0.43036    0.20764
-0.1147    -0.99394    0.68397   -1.089      0.51071   -0.37707
2.0347    -0.13211   -0.35318    0.01808    0.40005   -0.13595
-0.058802   0.073057   0.12816    0.0078398  0.70848   -0.36644
0.25745    0.75544   -0.037074   0.50653   -0.055351  -0.20353
-0.37791   -0.67328  ] (50,)
word [-0.1643     0.15722   -0.55021   -0.3303     0.66463   -0.1152
-0.2261    -0.23674   -0.86119    0.24319    0.074499   0.61081
0.73683   -0.35224    0.61346    0.0050975 -0.62538   -0.0050458
0.18392   -0.12214   -0.65973   -0.30673    0.35038    0.75805
1.0183    -1.7424    -1.4277     0.38032    0.37713   -0.74941
2.9401    -0.8097    -0.66901    0.23123   -0.073194  -0.13624
0.24424   -1.0129    -0.24919   -0.06893    0.70231   -0.022177
-0.64684    0.59599    0.027092   0.11203    0.61214    0.74339
0.23572   -0.1369   ] (50,)
into [ 6.6749e-01 -4.1321e-01  6.5755e-02 -4.6653e-01  2.7619e-04  1.8348e-01
-6.5269e-01  9.3383e-02 -8.6802e-03 -1.8874e-01 -6.3057e-03  4.4894e-02
-6.6801e-01  4.8506e-01 -1.1850e-01  1.9968e-01  1.8180e-01  3.3144e-02
-5.9108e-01 -2.1829e-01  4.1438e-01  5.6740e-02  4.2155e-01  2.7798e-01
-1.1322e-01 -1.9227e+00  3.5513e-02  6.1928e-01  6.2206e-01 -6.3987e-01
3.9115e+00 -2.1078e-02 -2.4685e-01 -1.3922e-01 -2.2545e-01  5.9131e-01
-7.3220e-01  1.1620e-01  4.1550e-01 -1.5188e-01 -1.4933e-01  4.0739e-02
-1.0415e-01  2.3733e-01 -4.3800e-01  6.0590e-02  5.5073e-01 -9.6571e-01
-2.6875e-01 -1.1741e+00] (50,)
vectors [ 1.3247   -0.38281   0.27162   0.82353   1.7431    0.63094   1.9888
-1.0854    0.97619  -0.79769   0.70562   0.28915  -0.44682   0.16009
-0.25901  -0.35215   0.10791  -0.71015  -0.80975  -0.70704  -1.0186
-1.619     0.93473   1.1258   -0.22782   0.71059   0.22179  -0.42324
0.61644   0.30039   1.1298    0.075558  0.049487 -0.40429  -0.4642
-0.41281   0.193     0.29502  -0.74731   1.3598    1.2449    0.30083
-0.63276   1.5004   -0.30381   0.21208   1.1786   -0.036461 -0.3919
0.71549 ] (50,)
``````
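A direct dictionary lookup fails with a `KeyError` for any word that is not in the vocabulary, and the `glove.6B` vocabulary is lowercase only. The sketch below shows one possible way to handle both cases: it lowercases the sentence, skips unknown words, and stacks the remaining vectors into a matrix. The function name `sentence_to_matrix` and the 2-dimensional toy dictionary are hypothetical and stand in for the real `word2vector`.

```python
import numpy as np


def sentence_to_matrix(sentence, word2vector):
    """Return (matrix, missing): one row per known word, plus the words that were skipped.

    Assumes word2vector is non-empty and all of its vectors have the same length.
    """
    rows, missing = [], []
    for token in sentence.lower().split():
        vector = word2vector.get(token)  # None when the token is not in the vocabulary
        if vector is None:
            missing.append(token)
        else:
            rows.append(vector)
    dim = len(next(iter(word2vector.values())))
    matrix = np.vstack(rows) if rows else np.empty((0, dim), dtype="float32")
    return matrix, missing


# Toy 2-dimensional vocabulary with made-up values
toy = {
    "convert": np.array([0.1, 0.2], dtype="float32"),
    "word": np.array([0.3, 0.4], dtype="float32"),
}
matrix, missing = sentence_to_matrix("Convert word into vectors", toy)
print(matrix.shape, missing)
```

Skipping unknown words is only one option. Depending on the task, a zero vector or a dedicated placeholder vector may be a better fit.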

