Update README.md
README.md
CHANGED
@@ -13,9 +13,9 @@ license: apache-2.0
 
 This model was trained as part of the analysis and experiments performed in preparation of the release of the [Institutional Books 1.0 dataset](https://huggingface.co/collections/instdin/institutional-books-68366258bfb38364238477cf).
 
-
+We used this text classifier to assign 1 of 20 topics, derived from the first level of the [Library of Congress' Classification Outline](https://www.loc.gov/catdir/cpso/lcco/), to individual volumes.
 
-Complete experimental setup and results are available in our [technical report]() (Section 4.5).
+Complete experimental setup and results are available in our [technical report](TBD) (Section 4.5).
 
 ## Base model
 [google-bert/bert-base-multilingual-uncased](https://huggingface.co/google-bert/bert-base-multilingual-uncased)
@@ -57,8 +57,7 @@ All of the fields listed in this example are optional.
 ## Training data
 - Train split: 80,830 samples
 - Test split: 5,000 samples
-
-An additional set of 1,000 samples was set aside for benchmarking purposes.
+- An additional set of 1,000 samples was set aside for benchmarking purposes
 
 ## Validation Metrics
 | Metric | Value |
@@ -75,7 +74,7 @@ An additional set of 1,000 samples was set aside for benchmarking purposes.
 | recall_weighted | 0.9694 |
 | accuracy | 0.9694 |
 
-**
+**Post-training benchmark accuracy:** 97.2% (920)
 
 ## Cite
 ```