codechrl committed on
Commit 62aabfb · verified · 1 Parent(s): 2530ae7

Model save

Files changed (3)
  1. README.md +39 -41
  2. model.safetensors +1 -1
  3. training_args.bin +1 -1
README.md CHANGED
@@ -1,55 +1,53 @@
  ---
- language:
- - en
- - id
  tags:
- - text-classification
- - cybersecurity
- base_model: boltuix/bert-micro
  ---

- # Model Card for “bert-micro-cybersecurity”

- ## 1. Model Details

- **Model description**
- “bert-micro-cybersecurity” is a compact transformer model derived from `boltuix/bert-micro`, adapted for cybersecurity text-classification tasks (e.g., threat detection, incident reports, malicious vs. benign content).
- - Model type: fine-tuned lightweight BERT variant
- - Languages: English & Indonesian
- - Fine-tuned from: `boltuix/bert-micro`
- - Status: **Early version** — trained on ~ **2%** of planned data.

- **Model sources**
- - Base model: [boltuix/bert-micro](https://huggingface.co/boltuix/bert-micro)
- - Data: Cybersecurity Data

- ## 2. Uses

- ### Direct use
- You can use this model to classify cybersecurity-related text — for example, whether a given message, report, or log entry indicates malicious intent, abnormal behaviour, or the presence of a threat.

- ### Downstream use
- - Embedding extraction for clustering or anomaly detection in security logs (see the sketch after this list).
- - As part of a pipeline for phishing detection, malicious-email filtering, or incident triage.
- - As a feature extractor feeding a downstream system (e.g., alert generation, a SOC dashboard).
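
For the embedding-extraction path above, here is a minimal sketch (not part of the original card): it assumes the checkpoint can be loaded as a plain encoder via `AutoModel` and uses the [CLS] vector as the sentence embedding, both of which are choices of this example rather than documented behaviour.

```python
# Illustrative only: pull per-text embeddings for clustering / anomaly detection.
import torch
from transformers import AutoTokenizer, AutoModel

repo_id = "codechrl/bert-micro-cybersecurity"
tokenizer = AutoTokenizer.from_pretrained(repo_id)
encoder = AutoModel.from_pretrained(repo_id)  # the classification head is dropped here

logs = [
    "Failed password for root from 10.0.0.5 port 22 ssh2",
    "Scheduled backup completed successfully",
]

with torch.no_grad():
    batch = tokenizer(logs, return_tensors="pt", padding=True, truncation=True)
    hidden = encoder(**batch).last_hidden_state   # (batch, seq_len, hidden_size)
    embeddings = hidden[:, 0, :]                  # [CLS] vector per input

print(embeddings.shape)
```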

- ### Out-of-scope use
- - Not meant for high-stakes automated blocking decisions without human review.
- - Not optimized for languages other than English and Indonesian.
- - Not tested on non-cybersecurity domains or out-of-distribution data.

- ## 3. Bias, Risks, and Limitations
- Because the model is trained on only a very small subset (~2%) of the planned data, performance is preliminary and may degrade on unseen or specialized domains (industrial control, IoT logs, other languages).
- - Inherits any biases present in the base model (`boltuix/bert-micro`) and in the fine-tuning data — e.g., over-representation of certain threat types or vendor- and tooling-specific vocabulary.
- - Should not be used as the sole authority for incident decisions; it is an aid to human analysts.
- ## 4. How to Get Started with the Model
- ```python
- from transformers import AutoTokenizer, AutoModelForSequenceClassification
-
- # Load the fine-tuned tokenizer and classifier
- tokenizer = AutoTokenizer.from_pretrained("codechrl/bert-micro-cybersecurity")
- model = AutoModelForSequenceClassification.from_pretrained("codechrl/bert-micro-cybersecurity")
-
- # Score a single log line and take the highest-scoring class
- inputs = tokenizer("The server logged an unusual outbound connection to 123.123.123.123", return_tensors="pt", truncation=True, padding=True)
- outputs = model(**inputs)
- logits = outputs.logits
- predicted_class = logits.argmax(dim=-1).item()
- ```
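
Continuing the snippet above, a hedged follow-up (not part of the original card) turns the predicted index into a label name and a confidence score via `model.config.id2label`; the card does not state which labels were configured, so generic `LABEL_n` names may come back.

```python
import torch

# Illustrative continuation: readable label plus probability for the prediction above.
probs = torch.softmax(logits, dim=-1)
label = model.config.id2label[predicted_class]   # falls back to "LABEL_n" if unset
print(f"{label}: {probs[0, predicted_class].item():.3f}")
```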
  ---
+ library_name: transformers
+ base_model: codechrl/bert-micro-cybersecurity
  tags:
+ - generated_from_trainer
+ model-index:
+ - name: bert-micro-cybersecurity
+   results: []
  ---

+ <!-- This model card has been generated automatically according to the information the Trainer had access to. You
+ should probably proofread and complete it, then remove this comment. -->

+ # bert-micro-cybersecurity

+ This model is a fine-tuned version of [codechrl/bert-micro-cybersecurity](https://huggingface.co/codechrl/bert-micro-cybersecurity) on an unspecified dataset.

+ ## Model description

+ More information needed

+ ## Intended uses & limitations

+ More information needed

+ ## Training and evaluation data

+ More information needed

+ ## Training procedure

+ ### Training hyperparameters

+ The following hyperparameters were used during training (a reproduction sketch follows this list):
+ - learning_rate: 5e-05
+ - train_batch_size: 8
+ - eval_batch_size: 8
+ - seed: 42
+ - optimizer: AdamW (torch fused, OptimizerNames.ADAMW_TORCH_FUSED) with betas=(0.9, 0.999), epsilon=1e-08, and no additional optimizer arguments
+ - lr_scheduler_type: linear
+ - lr_scheduler_warmup_ratio: 0.06
+ - num_epochs: 3
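
To approximate this run, here is a minimal, hypothetical `TrainingArguments` sketch assembled from the values above; the output directory and the dataset are placeholders, since the card does not record them.

```python
# Hypothetical reconstruction of the listed hyperparameters (not the author's actual script).
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="bert-micro-cybersecurity",    # placeholder; not recorded in the card
    learning_rate=5e-05,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    seed=42,
    optim="adamw_torch_fused",                # OptimizerNames.ADAMW_TORCH_FUSED
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-08,
    lr_scheduler_type="linear",
    warmup_ratio=0.06,
    num_train_epochs=3,
)
# Pass training_args to a Trainer together with the model and a dataset of your own;
# the training/evaluation data used for this checkpoint is not documented.
```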

+ ### Training results


+ ### Framework versions

+ - Transformers 4.57.0
+ - Pytorch 2.8.0+cu128
+ - Datasets 4.2.0
+ - Tokenizers 0.22.1
model.safetensors CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:873dd9ce370e9fd561cba1d3c713e806e31c4b54585edf06cf940b25a5a33bed
+ oid sha256:771d455ace0a6fd55c4819322b1388568b18b10da0527b804a55eaa36d811c01
  size 17671560
training_args.bin CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:e7bb591ef4b3eb244b42ddcabd8e4d0914fb1e5705cfabb1514c3b55118856a4
+ oid sha256:084656c8a1292904b005795f8f0538fe567f1dc4776d4e210711e623b4041f7b
  size 5841