# Model Card for dataeaze/RegLLM-v2-ChecklistPointsExtractor-Llama-3.2-3B
This model extracts audit checklist points from regulatory documents.
## Model Details

### Model Description

Audit and compliance teams in most enterprises must read and comprehend a large number of regulatory documents. To ensure compliance, the usual procedure is to create audit checklists from the applicable regulations. This model has been fine-tuned to automate the extraction of audit checklist points. This small language model powers complieaze.ai, an agentic Gen AI application for regulatory compliance.
- Developed by: dataeaze systems pvt ltd
- Model type: LlamaForCausalLM
- Language(s) (NLP): English
- License: llama3.2
- Finetuned from model: meta-llama/Llama-3.2-3B-Instruct
## Uses

### Direct Use

The model extracts audit checklist points from regulatory documents such as acts, rules, regulations, circulars, master circulars, and master directions.

### Downstream Use

The model is part of complieaze.ai, a regulatory compliance application developed by dataeaze systems.

### Out-of-Scope Use

The model is not intended for information security compliance tasks.
## Bias, Risks, and Limitations

- The model might not work well for tasks other than extracting audit checklist points.
- The model has been fine-tuned and tested on documents from Indian regulators such as RBI and SEBI, so it may be biased towards the structure of Indian regulatory documents.

### Recommendations

For most users we recommend accessing the model through the complieaze.ai application. Only AI experts should attempt to use the model directly.
## How to Get Started with the Model

Use the code below to get started with the model.
```python
import json

from transformers import AutoModelForCausalLM, AutoTokenizer, pipeline

model_id = "dataeaze/RegLLM-v2-ChecklistPointsExtractor-Llama-3.2-3B"
model = AutoModelForCausalLM.from_pretrained(
    model_id, device_map="auto", trust_remote_code=True
)
tokenizer = AutoTokenizer.from_pretrained(model_id)

system_prompt = (
    "You are an expert at creating an auditor's checklist from a given "
    "text of a regulatory compliance document"
)
user_prompt = """
### TEXT:
{document_text}"""

document_text = (
    "(1)The CICs shall share with the CIs, the logic and "
    "validation processes involved in data acceptance, so that instances of data "
    "rejection are minimised. The reasons for rejection shall be parameterised by "
    "the CICs and circulated among the concerned CIs."
    "\n"
    "(2)Rejection reports issued by CICs shall be simple and understandable so that "
    "they can be used for fixing reporting and data level issues by the CIs."
    "\n"
    "(3)CIs shall rectify the rejected data and upload the same with the CICs within "
    "seven days of receipt of such rejection report."
    "\n"
    "(4)CICs shall undertake periodic exercises/checks, at least once in a quarter, "
    "to identify identifier inconsistencies in its database and share the findings "
    "of such identifier inconsistencies with the respective CIs for confirming the "
    "accuracy of the same. The list of CIs who do not respond in timely manner (say "
    "within a month) to such data cleansing exercises shall be sent to Department of "
    "Supervision, Central Office at half yearly intervals (as on March 31 and "
    "September 30) for information."
)

user_prompt = user_prompt.format(document_text=document_text)
messages = [
    {"role": "system", "content": system_prompt},
    {"role": "user", "content": user_prompt},
]

pipe = pipeline(
    "text-generation",
    model=model,
    tokenizer=tokenizer,
    temperature=0.01,
    max_new_tokens=4096,
)
# With chat-style input, generated_text holds the full message list;
# the last entry is the assistant's JSON reply.
out = pipe(messages)[0]["generated_text"][-1]

out_dic = json.loads(out["content"])
checklist = out_dic["checklist"]
for item in checklist:
    print(item["action_item"])
    print("------------")
```
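Because the model returns its checklist as JSON text, downstream code may want to guard against malformed generations instead of letting `json.loads` raise. A minimal sketch, assuming the `{"checklist": [{"action_item": ...}]}` schema shown above; the `extract_checklist` helper is illustrative and not part of the model's API:

```python
import json


def extract_checklist(raw: str) -> list[str]:
    """Parse the model's JSON reply and return the action items.

    Returns an empty list when the reply is not valid JSON or lacks a
    "checklist" key, instead of raising.
    """
    try:
        data = json.loads(raw)
    except json.JSONDecodeError:
        return []
    return [item.get("action_item", "") for item in data.get("checklist", [])]


# Well-formed reply:
reply = '{"checklist": [{"action_item": "Verify rejection reports are shared."}]}'
print(extract_checklist(reply))  # → ['Verify rejection reports are shared.']
# Malformed output degrades gracefully:
print(extract_checklist("not json"))  # → []
```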
## Evaluation

### Testing Data, Factors & Metrics

#### Testing Data

The evaluation dataset consists of 45 sections, of which 38 contain checklist points and 7 do not. There are a total of 212 audit checklist points (and 5 empty audit checklists). The sections are extracted and randomly sampled from documents of the regulators RBI, SEBI, IRDAI, NPCI, NFRA, IBBI, and FSDC.
#### Factors

Our gold answers come from the DeepSeek-V3 model. We use LLM-as-a-Judge (based on GPT-4o) to compare predicted responses with the DeepSeek-V3 gold answers. The LLM-as-a-Judge evaluations closely matched human judgements from domain experts on a dataset of 74 sections and 610 checklist points.
#### Metrics

We calculate precision, recall, and F1 score. To calculate these, we compare the answers of two LLMs, one of which is considered the gold standard (DeepSeek-V3 in our case):

- true_positives: a checklist point in the gold answer is also captured in the generated answer
- false_positives: the gold checklist for a section is empty, but checklist items were generated
- false_negatives: checklist items in the gold answer are not captured in the generated answer
- true_negatives: the gold checklist is empty and the generated checklist is also empty

These comparisons are based on human judgements and imitated by the LLM-as-a-Judge.
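Given these counts, the scores in the results table follow from the standard formulas. A quick check using the RegLLM counts:

```python
def prf1(tp: int, fp: int, fn: int) -> tuple[float, float, float]:
    """Precision, recall, and F1 from confusion counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# RegLLM counts from the results table: TP=187, FP=0, FN=25
p, r, f1 = prf1(187, 0, 25)
print(round(p, 3), round(r, 3), round(f1, 3))  # → 1.0 0.882 0.937
```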
### Results

| Metric | RegLLM | GPT-4o 2024-05-13 |
|---|---|---|
| True Positives (TP) | 187 | 191 |
| False Positives (FP) | 0 | 0 |
| False Negatives (FN) | 25 | 21 |
| True Negatives (TN) | 5 | 5 |
| Precision | 1.0 | 1.0 |
| Recall | 0.882 | 0.9 |
| F1 Score | 0.937 | 0.947 |
| Throughput (tokens/sec) | 223 [1] | 90.2 [2] |
| Cost - Input ($ per 1M tokens) | 0.534 [3] | 5 [4] |
| Cost - Output ($ per 1M tokens) | 0.534 [3] | 15 [4] |
### References

2. https://artificialanalysis.ai/models/gpt-4o-2024-05-13
3. Estimated based upon vast.ai pricing of $0.86 per hour for an L40S GPU
4. https://platform.openai.com/docs/pricing#other-models
#### Summary

RegLLM offers a cost-effective model for extraction of audit checklist points. It is 18.73x cheaper and 2.47x faster than GPT-4o, with about a 1 per cent impact on the quality of results.
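The speed and cost multipliers follow from the results table; a quick arithmetic check, assuming the "18.73x cheaper" figure averages the input and output cost ratios:

```python
# Throughput (tokens/sec) and cost ($ per 1M tokens) from the results table.
regllm_tps, gpt4o_tps = 223, 90.2
regllm_cost_in = regllm_cost_out = 0.534
gpt4o_cost_in, gpt4o_cost_out = 5, 15

speedup = regllm_tps / gpt4o_tps
# Assumption: the cost multiplier is the mean of the input and output ratios.
cost_ratio = (gpt4o_cost_in / regllm_cost_in + gpt4o_cost_out / regllm_cost_out) / 2

print(round(speedup, 2), round(cost_ratio, 2))  # → 2.47 18.73
```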
## Environmental Impact
Carbon emissions can be estimated using the Machine Learning Impact calculator presented in Lacoste et al. (2019).
- Hardware Type: Nvidia L40S
- Hours used: 10
- Cloud Provider: vast.ai
- Compute Region: Quebec, Canada
- Carbon Emitted: 0.21 kg CO2 eq.
## Model Card Authors
- Atharva Inamdar [email protected]
- Tony Tom [email protected]
- Suyash Chougule [email protected]
- Saurabh Daptardar [email protected]
## Model Card Contact
Saurabh Daptardar [email protected]