 Use gensim to load a pre-trained word2vec model. Like [Google News from Google drive](https://drive.google.com/file/d/0B7XkCwpI5KDYNlNUTTlSS21pQmM/edit).
 ```python
 import gensim
-model = gensim.models.Word2Vec.load_word2vec_format('./GoogleNews-vectors-negative300.bin', binary=True)
+model = gensim.models.KeyedVectors.load_word2vec_format('./GoogleNews-vectors-negative300.bin', binary=True)
 ```
 You can also use gensim to load Facebook's Fasttext [English](https://fasttext.cc/docs/en/english-vectors.html) and [Multilingual models](https://fasttext.cc/docs/en/crawl-vectors.html)
 ```
@@ -103,6 +103,11 @@ There are three types of augmentations which can be used:
 ```python
 from textaugment import Word2vec
 ```
+- fasttext
+
+```python
+from textaugment import Fasttext
+```
 
 - wordnet
 ```python
@@ -112,17 +117,20 @@ from textaugment import Wordnet
 ```python
 from textaugment import Translate
 ```
-#### Word2vec-based augmentation
+#### Fasttext/Word2vec-based augmentation
 
 [See this notebook for an example](https://github.com/dsfsi/textaugment/blob/master/examples/word2vec_example.ipynb)
 
 **Basic example**
 
 ```python
->>> from textaugment import Word2vec
+>>> from textaugment import Word2vec, Fasttext
 >>> t = Word2vec(model='path/to/gensim/model' or 'gensim model itself')
 >>> t.augment('The stories are good')
 The films are good
+>>> t = Fasttext(model='path/to/gensim/model' or 'gensim model itself')
+>>> t.augment('The stories are good')
+The films are good
 ```
 **Advanced example**
 
@@ -131,8 +139,11 @@ The films are good
 >>> v = False # verbose mode to replace all the words. If enabled runs is not effective. Used in this paper (https://www.cs.cmu.edu/~diyiy/docs/emnlp_wang_2015.pdf)
 >>> p = 0.5 # The probability of success of an individual trial. (0.1<p<1.0), default is 0.5. Used by Geometric distribution to selects words from a sentence.
 
->>> t = Word2vec(model='path/to/gensim/model' or 'gensim model itself', runs=5, v=False, p=0.5)
->>> t.augment('The stories are good')
+>>> word = Word2vec(model='path/to/gensim/model' or 'gensim model itself', runs=5, v=False, p=0.5)
+>>> word.augment('The stories are good', top_n=10)
+The movies are excellent
+>>> fast = Fasttext(model='path/to/gensim/model' or 'gensim model itself', runs=5, v=False, p=0.5)
+>>> fast.augment('The stories are good', top_n=10)
 The movies are excellent
 ```
 #### WordNet-based augmentation
@@ -155,7 +166,7 @@ In the afternoon, John is walking to town
 >>> p = 0.5 # The probability of success of an individual trial. (0.1<p<1.0), default is 0.5. Used by Geometric distribution to selects words from a sentence.
 
 >>> t = Wordnet(v=False ,n=True, p=0.5)
->>> t.augment('In the afternoon, John is going to town')
+>>> t.augment('In the afternoon, John is going to town', top_n=10)
 In the afternoon, Joseph is going to town.
 ```
 #### RTT-based augmentation
@@ -183,7 +194,7 @@ one of its synonyms chosen at random.
 ```python
 >>> from textaugment import EDA
 >>> t = EDA()
->>> t.synonym_replacement("John is going to town")
+>>> t.synonym_replacement("John is going to town", top_n=10)
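A note on the `p` parameter documented in the README comments in this diff: the text says only that `p` is the success probability of an individual trial and that a geometric distribution is used to select words. The sketch below shows one way such a selection could work, using only the standard library. It is an illustration under assumptions, not textaugment's actual code: the names `geometric_trials` and `select_words` are hypothetical, and the rule "keep a word when its first trial succeeds" is assumed rather than confirmed by the diff.

```python
import random


def geometric_trials(p, rng):
    """Count Bernoulli(p) trials up to and including the first success."""
    if not 0.0 < p <= 1.0:
        raise ValueError("p must be in (0, 1]")
    trials = 1
    while rng.random() >= p:  # failure: try again
        trials += 1
    return trials


def select_words(words, p=0.5, rng=None):
    """Hypothetical selection rule: keep a word when its geometric draw is 1.

    A draw equals 1 with probability p, so a higher p marks more words
    as candidates for replacement.
    """
    rng = rng or random.Random()
    return [w for w in words if geometric_trials(p, rng) == 1]


if __name__ == "__main__":
    sentence = "The stories are good".split()
    print(select_words(sentence, p=0.5, rng=random.Random(0)))
```

Under this assumed rule, `p=1.0` selects every word, and values near the lower bound quoted in the README select very few.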