krinal committed on
Commit 2b28239 · verified · 1 Parent(s): 57dca78

Update README.md

Files changed (1)
  1. README.md +8 -5
README.md CHANGED
@@ -12,7 +12,7 @@ pipeline_tag: token-classification
 
  - tokenizer for hindi language
 
- #### usage
+ #### Usage
 
  ```py
  from transformers import AutoTokenizer
@@ -28,18 +28,21 @@ encoded_str = hi_tokenizer.encode(hi_str)
  decoded_str = hi_tokenizer.decode(encoded_str)
  ```
 
- #### language
+ #### Language
 
  - hi
 
- #### dataset
+ #### Training
+
+ - For training see [Train BertWordPieceTokenizer](https://gist.github.com/kjdeveloper8/57d9e16848cd77df778804c9e2214a78)
+
+ #### Dataset
 
  - trained on BHAAV (hi sentiment analysis dataset)
  - dataset source: [Bhaav](https://github.com/midas-research/bhaav)
  - Hindi text corpus (20,304 sentences)
 
-
- #### citation
+ #### Citation
 
  ```shell
  @article{kumar2019bhaav,