Create README.md
README.md
ADDED
@@ -0,0 +1,52 @@
---
language:
- en
metrics:
- accuracy
- precision
- f1
- recall
pipeline_tag: token-classification
library_name: spacy
tags:
- spacy
- nlp
- python
- skill-extraction
- ner
---

# Skill Extraction Model using spaCy

This is a custom **Named Entity Recognition (NER)** model built with **spaCy** to identify and extract skills from resumes and job descriptions.

## Why This Model?

To improve flexibility and accuracy, we moved from a static skill extraction approach, which depended on predefined skill lists, to a dynamic one. The new method uses spaCy to fine-tune a pre-trained Named Entity Recognition (NER) model that extracts skills directly from resumes and job descriptions. Because it does not rely on predefined skill lists, the model can recognize context-specific, domain-relevant, and newly emerging skills. This makes it a more adaptive and scalable solution for real-world skill extraction and talent-matching applications.

---

## How to Use

### 1. Load the Model from Hugging Face

```python
from huggingface_hub import snapshot_download
import spacy

# Download the model from the Hub
model_path = snapshot_download("amjad-awad/skill-extractor", repo_type="model")

# Load the model with spaCy
nlp = spacy.load(model_path)

# Example usage
text = "Experienced in Python, JavaScript, and cloud services like AWS and Azure."
doc = nlp(text)

# Extract skill entities
skills = [ent.text for ent in doc.ents if "SKILLS" in ent.label_]
print(skills)
```

Output:

```
['Python', 'JavaScript', 'cloud', 'AWS', 'Azure']
```
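
Since the model is aimed at talent-matching, the extracted entities usually need light post-processing before they can be compared. The following is a minimal sketch of one way to do that, not part of the model itself: `collect_skills` and `skill_overlap` are hypothetical helper names, and the `SimpleNamespace` objects are stand-ins for spaCy entities so the snippet runs without downloading the model. With the real pipeline you would pass `doc.ents` instead.

```python
from types import SimpleNamespace


def collect_skills(ents, label_marker="SKILLS"):
    """Return unique skill strings from entity-like objects, in first-seen order.

    `ents` is any iterable of objects exposing `.text` and `.label_`,
    such as `doc.ents` from a loaded spaCy pipeline.
    """
    seen = set()
    skills = []
    for ent in ents:
        if label_marker not in ent.label_:
            continue  # skip entities that are not skills
        cleaned = ent.text.strip()
        key = cleaned.lower()  # case-insensitive duplicate check
        if key and key not in seen:
            seen.add(key)
            skills.append(cleaned)
    return skills


def skill_overlap(resume_skills, job_skills):
    """Fraction of job skills that also appear in the resume (case-insensitive)."""
    resume = {s.lower() for s in resume_skills}
    job = {s.lower() for s in job_skills}
    return len(resume & job) / len(job) if job else 0.0


# Stand-in entities (an assumption for illustration only);
# replace with `doc.ents` when using the real model.
fake_ents = [
    SimpleNamespace(text="Python", label_="SKILLS"),
    SimpleNamespace(text="python", label_="SKILLS"),
    SimpleNamespace(text="AWS", label_="SKILLS"),
    SimpleNamespace(text="London", label_="GPE"),
]

resume_skills = collect_skills(fake_ents)
print(resume_skills)
print(skill_overlap(resume_skills, ["Python", "Azure"]))
```

The overlap score here is a deliberately simple exact-match ratio; it does not handle synonyms or abbreviations, so treat it as a starting point rather than a matching method.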