Commit f8b3d50

committed
update documentation
1 parent 918f91d commit f8b3d50

File tree

5 files changed

+35
-24
lines changed


README.md

Lines changed: 18 additions & 7 deletions
@@ -64,7 +64,7 @@ nltk.download('averaged_perceptron_tagger')
 Use gensim to load a pre-trained word2vec model. Like [Google News from Google drive](https://drive.google.com/file/d/0B7XkCwpI5KDYNlNUTTlSS21pQmM/edit).
 ```python
 import gensim
-model = gensim.models.Word2Vec.load_word2vec_format('./GoogleNews-vectors-negative300.bin', binary=True)
+model = gensim.models.KeyedVectors.load_word2vec_format('./GoogleNews-vectors-negative300.bin', binary=True)
 ```
 You can also use gensim to load Facebook's Fasttext [English](https://fasttext.cc/docs/en/english-vectors.html) and [Multilingual models](https://fasttext.cc/docs/en/crawl-vectors.html)
 ```
@@ -103,6 +103,11 @@ There are three types of augmentations which can be used:
 ```python
 from textaugment import Word2vec
 ```
+- fasttext
+
+```python
+from textaugment import Fasttext
+```

 - wordnet
 ```python
@@ -112,17 +117,20 @@ from textaugment import Wordnet
 ```python
 from textaugment import Translate
 ```
-#### Word2vec-based augmentation
+#### Fasttext/Word2vec-based augmentation

 [See this notebook for an example](https://github.com/dsfsi/textaugment/blob/master/examples/word2vec_example.ipynb)

 **Basic example**

 ```python
->>> from textaugment import Word2vec
+>>> from textaugment import Word2vec, Fasttext
 >>> t = Word2vec(model='path/to/gensim/model'or 'gensim model itself')
 >>> t.augment('The stories are good')
 The films are good
+>>> t = Fasttext(model='path/to/gensim/model'or 'gensim model itself')
+>>> t.augment('The stories are good')
+The films are good
 ```
 **Advanced example**

@@ -131,8 +139,11 @@ The films are good
 >>> v = False # verbose mode to replace all the words. If enabled runs is not effective. Used in this paper (https://www.cs.cmu.edu/~diyiy/docs/emnlp_wang_2015.pdf)
 >>> p = 0.5 # The probability of success of an individual trial. (0.1<p<1.0), default is 0.5. Used by Geometric distribution to selects words from a sentence.

->>> t = Word2vec(model='path/to/gensim/model'or'gensim model itself', runs=5, v=False, p=0.5)
->>> t.augment('The stories are good')
+>>> word = Word2vec(model='path/to/gensim/model'or'gensim model itself', runs=5, v=False, p=0.5)
+>>> word.augment('The stories are good', top_n=10)
+The movies are excellent
+>>> fast = Fasttext(model='path/to/gensim/model'or'gensim model itself', runs=5, v=False, p=0.5)
+>>> fast.augment('The stories are good', top_n=10)
 The movies are excellent
 ```
 #### WordNet-based augmentation
@@ -155,7 +166,7 @@ In the afternoon, John is walking to town
 >>> p = 0.5 # The probability of success of an individual trial. (0.1<p<1.0), default is 0.5. Used by Geometric distribution to selects words from a sentence.

 >>> t = Wordnet(v=False ,n=True, p=0.5)
->>> t.augment('In the afternoon, John is going to town')
+>>> t.augment('In the afternoon, John is going to town', top_n=10)
 In the afternoon, Joseph is going to town.
 ```
 #### RTT-based augmentation
@@ -183,7 +194,7 @@ one of its synonyms chosen at random.
 ```python
 >>> from textaugment import EDA
 >>> t = EDA()
->>> t.synonym_replacement("John is going to town")
+>>> t.synonym_replacement("John is going to town", top_n=10)
 John is give out to town
 ```

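Reviewer note: the README hunks above add a `top_n` argument and keep the existing `p` comment about geometric-distribution word selection, but neither is illustrated. The sketch below is one possible reading, not textaugment's implementation. `TOY_NEIGHBOURS`, `geometric_index`, and this `augment` function are hypothetical names, and treating `top_n` as "sample the replacement from the first `top_n` ranked neighbours" is an assumption.

```python
import random

# Hypothetical stand-in for an embedding model's ranked nearest neighbours.
# A real model would order candidates by vector similarity.
TOY_NEIGHBOURS = {
    "stories": ["films", "movies", "tales", "novels"],
    "good": ["excellent", "great", "decent", "nice"],
}


def geometric_index(p, length, rng):
    """Count Bernoulli(p) failures before the first success, wrapped into range."""
    if not 0 < p <= 1:
        raise ValueError("p must be in (0, 1]")
    failures = 0
    while rng.random() >= p:
        failures += 1
    return failures % length


def augment(sentence, top_n=10, p=0.5, rng=None):
    """Replace one word with a candidate drawn from its first top_n neighbours."""
    if top_n < 1:
        raise ValueError("top_n must be at least 1")
    rng = rng or random.Random()
    words = sentence.split()
    # Only words present in the toy table are eligible for replacement.
    eligible = [i for i, w in enumerate(words) if w.lower() in TOY_NEIGHBOURS]
    if not eligible:
        return sentence
    position = eligible[geometric_index(p, len(eligible), rng)]
    pool = TOY_NEIGHBOURS[words[position].lower()][:top_n]
    words[position] = rng.choice(pool)
    return " ".join(words)


if __name__ == "__main__":
    print(augment("The stories are good", top_n=2, p=0.5))
```

Under this reading, a small `top_n` keeps replacements close to the original word, while a larger value trades fidelity for variety.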
examples/aeda_example.ipynb

Lines changed: 1 addition & 1 deletion
@@ -91,7 +91,7 @@
 "name": "python",
 "nbconvert_exporter": "python",
 "pygments_lexer": "ipython3",
-"version": "3.8.15"
+"version": "3.7.7"
 }
 },
 "nbformat": 4,

examples/eda_example.ipynb

Lines changed: 10 additions & 10 deletions
@@ -22,7 +22,7 @@
 },
 {
 "cell_type": "code",
-"execution_count": 7,
+"execution_count": 2,
 "metadata": {},
 "outputs": [],
 "source": [
@@ -39,19 +39,19 @@
 },
 {
 "cell_type": "code",
-"execution_count": 8,
+"execution_count": 3,
 "metadata": {},
 "outputs": [
 {
 "name": "stdout",
 "output_type": "stream",
 "text": [
-"John is going to townspeople\n"
+"John is choke to town\n"
 ]
 }
 ],
 "source": [
-"output = t.synonym_replacement(\"John is going to town\")\n",
+"output = t.synonym_replacement(\"John is going to town\", top_n=10)\n",
 "print(output)"
 ]
 },
@@ -65,14 +65,14 @@
 },
 {
 "cell_type": "code",
-"execution_count": 9,
+"execution_count": 4,
 "metadata": {},
 "outputs": [
 {
 "name": "stdout",
 "output_type": "stream",
 "text": [
-"John is going st john to town\n"
+"John is going to lead town\n"
 ]
 }
 ],
@@ -91,14 +91,14 @@
 },
 {
 "cell_type": "code",
-"execution_count": 10,
+"execution_count": 5,
 "metadata": {},
 "outputs": [
 {
 "name": "stdout",
 "output_type": "stream",
 "text": [
-"John to going is town\n"
+"John is to going town\n"
 ]
 }
 ],
@@ -117,14 +117,14 @@
 },
 {
 "cell_type": "code",
-"execution_count": 11,
+"execution_count": 6,
 "metadata": {},
 "outputs": [
 {
 "name": "stdout",
 "output_type": "stream",
 "text": [
-"going to town\n"
+"John going to town\n"
 ]
 }
 ],

examples/fasttext_example.ipynb

Lines changed: 4 additions & 4 deletions
@@ -79,8 +79,8 @@
 "outputs": [],
 "source": [
 "from textaugment import Word2vec\n",
-"t = Word2vec(model = model.wv)\n",
-"output = t.augment('The stories are good')"
+"t = Word2vec(model = model)\n",
+"output = t.augment('The stories are good', top_n=10)"
 ]
 },
 {
@@ -132,9 +132,9 @@
 "name": "python",
 "nbconvert_exporter": "python",
 "pygments_lexer": "ipython3",
-"version": "3.7.3"
+"version": "3.7.7"
 }
 },
 "nbformat": 4,
-"nbformat_minor": 2
+"nbformat_minor": 4
 }

examples/word2vec_example.ipynb

Lines changed: 2 additions & 2 deletions
@@ -12,7 +12,7 @@
 },
 {
 "cell_type": "code",
-"execution_count": null,
+"execution_count": 1,
 "metadata": {
 "colab": {},
 "colab_type": "code",
@@ -170,7 +170,7 @@
 "source": [
 "from textaugment import Word2vec\n",
 "t = Word2vec(model=model)\n",
-"output = t.augment('The stories are good')"
+"output = t.augment('The stories are good', top_n=10)"
 ]
 },
 {
