artemkramov committed
Commit 0ab75dd · 1 Parent(s): b56fe65

Update README.md

Files changed (1)
  1. README.md +50 -71
README.md CHANGED
@@ -21,114 +21,93 @@ using the [F-Coref](https://arxiv.org/abs/2209.04280) library. The model was tra

  <!-- Provide a longer summary of what this model is. -->

-
-
  - **Developed by:** [Artem Kramov](https://www.linkedin.com/in/artem-kramov-0b3731100/), Andrii Kursin ([email protected]).
  - **Languages:** Ukrainian
  - **Finetuned from model:** [XLM-RoBERTa-base](https://huggingface.co/ukr-models/xlm-roberta-base-uk)

- ### Model Sources [optional]

  <!-- Provide the basic links for the model. -->

- - **Repository:** [More Information Needed]
- - **Paper [optional]:** [More Information Needed]
- - **Demo [optional]:** [More Information Needed]
-
- ## Uses
-
- <!-- Address questions around how the model is intended to be used, including the foreseeable users of the model and those affected by the model. -->
-
- ### Direct Use
-
- <!-- This section is for the model use without fine-tuning or plugging into a larger ecosystem/app. -->
-
- [More Information Needed]
-
- ### Downstream Use [optional]
-
- <!-- This section is for the model use when fine-tuned for a task, or when plugged into a larger ecosystem/app -->
-
- [More Information Needed]

  ### Out-of-Scope Use

- <!-- This section addresses misuse, malicious use, and uses that the model will not work well for. -->

- [More Information Needed]
-
- ## Bias, Risks, and Limitations
-
- <!-- This section is meant to convey both technical and sociotechnical limitations. -->
-
- [More Information Needed]
-
- ### Recommendations
-
- <!-- This section is meant to convey recommendations with respect to the bias, risk, and technical limitations. -->
-
- Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations.

  ## How to Get Started with the Model

  Use the code below to get started with the model.

- [More Information Needed]

- ## Training Details

- ### Training Data

- <!-- This should link to a Data Card, perhaps with a short stub of information on what the training data is all about as well as documentation related to data pre-processing or additional filtering. -->

- [More Information Needed]
-
- ### Training Procedure
-
- <!-- This relates heavily to the Technical Specifications. Content here should link to that section when it is relevant to the training procedure. -->
-
- #### Preprocessing [optional]
-
- [More Information Needed]

- #### Training Hyperparameters

- - **Training regime:** [More Information Needed] <!--fp32, fp16 mixed precision, bf16 mixed precision, bf16 non-mixed precision, fp16 non-mixed precision, fp8 mixed precision -->

- #### Speeds, Sizes, Times [optional]

- <!-- This section provides information about throughput, start/end time, checkpoint size if relevant, etc. -->

- [More Information Needed]

  ## Evaluation

  <!-- This section describes the evaluation protocols and provides the results. -->

- ### Testing Data, Factors & Metrics
-
- #### Testing Data
-
- <!-- This should link to a Data Card if possible. -->
-
- [More Information Needed]
-
- #### Factors
-
- <!-- These are the things the evaluation is disaggregating by, e.g., subpopulations or domains. -->
-
- [More Information Needed]
-
  #### Metrics

- <!-- These are the evaluation metrics being used, ideally with a description of why. -->

- [More Information Needed]

  ### Results

- [More Information Needed]

  #### Summary


  <!-- Provide a longer summary of what this model is. -->

  - **Developed by:** [Artem Kramov](https://www.linkedin.com/in/artem-kramov-0b3731100/), Andrii Kursin ([email protected]).
  - **Languages:** Ukrainian
  - **Finetuned from model:** [XLM-RoBERTa-base](https://huggingface.co/ukr-models/xlm-roberta-base-uk)

+ ### Model Sources

  <!-- Provide the basic links for the model. -->

+ - **Repository:** https://github.com/artemkramov/fastcoref-ua/blob/main/README.md
+ - **Demo:** [Google Colab](https://colab.research.google.com/drive/1vsaH15DFDrmKB4aNsQ-9TCQGTW73uk1y?usp=sharing)

  ### Out-of-Scope Use

+ According to the metrics on the evaluation dataset, the model is precision-oriented: it favors precision over recall. It also predicts mentions at a high level of granularity.
+ For example, the mention "Головний виконавчий директор Андрій Сидоренко" can be split into the following coreferent mentions: ["Головний виконавчий директор Андрій Сидоренко", "Головний виконавчий директор", "Андрій Сидоренко"].
+ This granularity can also be used to extract positions, roles, and other attributes of entities in the text, as sketched below.
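+
+ A minimal illustrative sketch of such post-processing (not part of the model card or the fastcoref API): it assumes string clusters in the format returned by `get_clusters()` and simply looks for mentions that are substrings of a longer mention, e.g. a role nested inside a "role + name" mention. The helper and the example cluster below are hypothetical.
+
+ ```python
+ # Hypothetical helper: find mentions nested inside longer mentions.
+ def nested_mentions(clusters):
+     mentions = [m for cluster in clusters for m in cluster]
+     return [(outer, inner)
+             for outer in mentions
+             for inner in mentions
+             if inner != outer and inner in outer]
+
+ # Illustrative cluster mirroring the example above (not real model output).
+ clusters = [["Головний виконавчий директор Андрій Сидоренко",
+              "Головний виконавчий директор",
+              "Андрій Сидоренко"]]
+ print(nested_mentions(clusters))
+ # [('Головний виконавчий директор Андрій Сидоренко', 'Головний виконавчий директор'),
+ #  ('Головний виконавчий директор Андрій Сидоренко', 'Андрій Сидоренко')]
+ ```
+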
  ## How to Get Started with the Model

  Use the code below to get started with the model.

+ ```python
+ from fastcoref import FCoref
+ import spacy
+
+ # Load the Ukrainian spaCy pipeline.
+ nlp = spacy.load('uk_core_news_md')
+
+ model_path = "artemkramov/coref-ua"
+ model = FCoref(model_name_or_path=model_path, device='cuda:0', nlp=nlp)
+
+ preds = model.predict(
+     texts=["""Мій друг дав мені свою машину та ключі до неї; крім того, він дав мені його книгу. Я з радістю її читаю."""]
+ )
+
+ # Clusters as (start, end) character offsets into the input text.
+ preds[0].get_clusters(as_strings=False)
+ > [[(0, 3), (13, 17), (66, 70), (83, 84)],
+    [(0, 8), (18, 22), (58, 61), (71, 75)],
+    [(18, 29), (42, 45)],
+    [(71, 81), (95, 97)]]
+
+ # The same clusters as mention strings.
+ preds[0].get_clusters()
+ > [['Мій', 'мені', 'мені', 'Я'], ['Мій друг', 'свою', 'він', 'його'], ['свою машину', 'неї'], ['його книгу', 'її']]
+
+ # Pairwise coreference score (logit) for two spans.
+ preds[0].get_logit(
+     span_i=(13, 17), span_j=(42, 45)
+ )
+
+ > -6.867196
+ ```
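+
+ A small usage note, assuming (as the output above suggests) that `get_clusters(as_strings=False)` returns character offsets into the input text: the spans can be sliced out of the original string directly. The `text` variable below is simply the same string that was passed to `predict`.
+
+ ```python
+ # Map the predicted character-offset spans back to surface strings.
+ text = """Мій друг дав мені свою машину та ключі до неї; крім того, він дав мені його книгу. Я з радістю її читаю."""
+ for cluster in preds[0].get_clusters(as_strings=False):
+     print([text[start:end] for start, end in cluster])
+ ```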

+ ## Training Details

+ ### Training Data

+ The model was trained on a silver (automatically annotated) coreference resolution dataset: https://huggingface.co/datasets/artemkramov/coreference-dataset-ua.
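+
+ A minimal sketch for loading the training data with the Hugging Face `datasets` library; the available splits and column names are defined by the dataset card and are not assumed here:
+
+ ```python
+ from datasets import load_dataset
+
+ # Download the silver Ukrainian coreference dataset from the Hugging Face Hub.
+ ds = load_dataset("artemkramov/coreference-dataset-ua")
+ print(ds)  # inspect the available splits and columns
+ ```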

  ## Evaluation

  <!-- This section describes the evaluation protocols and provides the results. -->

  #### Metrics

+ Two types of metrics were considered: mention-based metrics and coreference resolution metrics proper.
+
+ Mention-based metrics:
+ - mention precision
+ - mention recall
+ - mention F1
+
+ Coreference resolution metrics were computed as the average of the corresponding values across MUC, BCubed, and CEAFE (a small averaging sketch follows the list):
+ - coreference precision
+ - coreference recall
+ - coreference F1
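+
+ A minimal sketch of this averaging, assuming per-metric precision/recall/F1 values have already been produced by a coreference scorer; the numbers below are placeholders, not the reported results:
+
+ ```python
+ # Average MUC, BCubed and CEAFE scores into single coreference P/R/F1 values.
+ def average_coref_scores(scores):
+     """scores: dict mapping metric name -> (precision, recall, f1)."""
+     n = len(scores)
+     precision = sum(p for p, _, _ in scores.values()) / n
+     recall = sum(r for _, r, _ in scores.values()) / n
+     f1 = sum(f for _, _, f in scores.values()) / n
+     return precision, recall, f1
+
+ # Placeholder values for illustration only.
+ scores = {"MUC": (0.9, 0.8, 0.85), "BCubed": (0.7, 0.6, 0.65), "CEAFE": (0.8, 0.7, 0.75)}
+ print(average_coref_scores(scores))  # approximately (0.8, 0.7, 0.75)
+ ```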

  ### Results

+ The metrics for the validation dataset:
+
+ | Metric                | Value |
+ |:---------------------:|------:|
+ | Mention precision     | 0.850 |
+ | Mention recall        | 0.798 |
+ | Mention F1            | 0.824 |
+ | Coreference precision | 0.758 |
+ | Coreference recall    | 0.706 |
+ | Coreference F1        | 0.731 |

  #### Summary