kistepAI/SPARK-Summarization-GGUF

1. Description

SPARK-Summarization is a large language model developed by the Korea Institute of S&T Evaluation and Planning (KISTEP). This model specializes in summarization tasks and utilizes Chain of Density (CoD) reasoning to provide high-quality, condensed summaries in both Korean and English.

2. Key Features

Enhanced Summarization through CoD: Uses the Chain of Density approach, which starts from an initial summary and iteratively folds in missing entities, so the final summary stays concise while covering more of the source.
Multilingual Support: Capable of processing and generating summaries in both Korean and English.
Structured Output: Provides summaries in a bullet-point format for improved readability and quick comprehension.
Base Model: Built on Mistral-nemo as the foundation model.
Training Method: Trained with Supervised Fine-Tuning (SFT).
Context Length: The training data has a maximum context length of 16,384 tokens.
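Since the training data was capped at 16,384, it may be worth checking input length before summarizing a long document. The following is a minimal sketch, not part of the model's tooling. The `count_tokens` callable, the `reserve` budget for the prompt template and generated output, and the `rough_count` stand-in are all hypothetical, and the model's actual tokenizer should be used for a real count.

```python
from typing import Callable

MAX_CONTEXT = 16_384  # maximum training context length stated in Key Features


def fits_context(document: str,
                 count_tokens: Callable[[str], int],
                 reserve: int = 4_096) -> bool:
    """Return True if the document leaves `reserve` units free for the prompt and output.

    `reserve` is an assumed budget, not a value published for this model.
    """
    return count_tokens(document) + reserve <= MAX_CONTEXT


def rough_count(text: str) -> int:
    """Crude stand-in counter for illustration only: whitespace-separated words."""
    return len(text.split())
```

A document that fails the check could be split or shortened before it is placed in the prompt. How the model behaves on inputs longer than its training length is not documented here.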

3. Data

Source: KISTEP Documents
Count: 24,417

4. Usage

When running the model with Ollama, you can use the Modelfile.
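Once the model has been registered with `ollama create <name> -f Modelfile`, it can be queried through Ollama's local REST API. The following is a minimal sketch using only the Python standard library. The model name `spark-summarization` is a hypothetical placeholder for whatever name was passed to `ollama create`, and the URL assumes Ollama's default local port.

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"  # Ollama's default local endpoint
MODEL_NAME = "spark-summarization"  # hypothetical; use the name given to `ollama create`


def build_request(prompt: str, model: str = MODEL_NAME) -> urllib.request.Request:
    """Build a non-streaming generate request for a local Ollama server."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(prompt: str, model: str = MODEL_NAME) -> str:
    """Send the prompt and return the generated text (requires a running server)."""
    with urllib.request.urlopen(build_request(prompt, model)) as resp:
        return json.loads(resp.read().decode("utf-8"))["response"]
```

With `"stream": False` the server returns a single JSON object, which keeps the client simple. Streaming would be preferable for interactive use, since the reasoning trace can be long.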
Recommended prompt template (inputs: {TITLE}, {DOCUMENT}):

prompt_template: |
    당신은 요약 전문가입니다. 주어진 텍스트를 참고하여 요약을 작성하세요.
    
    ## 요약 단계:
    1. 텍스트 분석:
        - 문서 제목과 텍스트를 주의 깊게 읽고, 문서의 주요 주제를 파악하세요.
    2. 주요 주장(key_argument) 식별:
        - 다음 질문에 답변하기: "이 텍스트의 주요 주장 또는 핵심 논점은 무엇인가?"
    3. 주요 개체(entities) 추출: 
        - 5단어 이하의 주요 개체 3개를 뽑아주세요.
    4. 요약문의 주제(title) 생성: 
        - 제공된 텍스트에 대한 간결한 한문장의 주제를 생성하세요.
    5. 요약(summary) 작성: 
        - 주요 주장과 주요 개체, 주제를 참고하여 텍스트의 주요 내용을 요약하세요.
        
    ## 향상 단계
    6. 밀도 향상:
        - 초기 요약에 포함되지 않은 1~3개의 추가 설명 개체를 식별하세요.
        - 이전 및 새 개체를 모두 통합하여 요약의 밀도가 높은 버전을 작성하세요.
    7. 중요도 평가:
        - 이전 요약에서 필수적인 부분을 강조하고 덜 중요한 부분을 줄여서 수정하세요.
        - 새 요약이 주요 주장과 밀접하게 일치하는지 확인하세요.
    8. 유창성 향상:
        - 문법, 단어 선택, 표현을 다듬어 가독성과 자연스러운 흐름을 향상시키세요.
        - 요약 세부내용의 정확성과 완전성을 유지하면서 문장 구조를 개선하세요.
    
    ## 작성 방식:
        - 문서를 소개하는 대신 요약 내용만 작성하세요.
        - 구체적인 데이터나 수치보다는 전체 흐름과 방향을 설명하세요.
        - 주어진 내용에만 기반해 객관적으로 작성하세요.
        - 한국어로 작성하되, 영어 기술 용어와 고유 명사는 그대로 사용하세요.
    
    
    ## 입력:
    ### 문서 제목:
    {TITLE}
    ### 텍스트:
    {DOCUMENT}
    ## 출력 형식:
    <reason>
    초기 주요 주장: [초기 주요 주장]
    초기 주요 개체: [초기 주요 개체 목록]
    초기 제목: [초기 제목]
    초기 요약: [초기 요약 내용]
    
    밀도 향상 단계:
    새로 추가된 주요 개체: [새로 추가된 주요 개체 목록(with bullet points)]
    사고 과정: [주요 개체 선택 및 요약 작성에 대한 설명]
    업데이트 제목: [업데이트 제목]
    업데이트 요약: [업데이트 요약 내용]
    
    중요도 평가 단계:
    사고 과정: [요약 관련성 향상을 위한 중요도 평가 및 변경된 사항에 대한 설명]
    업데이트 제목: [업데이트 제목]
    업데이트 요약: [업데이트 요약 내용]
    
    언어 유창성 단계:
    사고 과정: [언어 명확성과 유창성을 개선하기 위해 변경된 사항에 대한 설명]
    업데이트 제목: [업데이트 제목]
    Updated Summary: [요약의 각 문장 목록(with bullet points)]
    </reason>
    
    <output>
        <key_argument>[주요 주장(한국어)]</key_argument>
        <entities>[주요 개체 목록, 쉼표로 구분]</entities>
        <title>[주제(한국어)]</title>
        <summary>
            <point>[첫번째 요약 문장(한국어)]</point>
            <point>[두번째 요약 문장(한국어)]</point>
            ...
        </summary>
    </output>
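The template above takes two placeholders and asks for a `<reason>` trace followed by an `<output>` block. The following is a minimal sketch of how a caller might fill the placeholders and pull the structured fields out of the generated text. The function names are hypothetical, and a lenient regex is used instead of an XML parser on the assumption that generated text will not always be well-formed XML.

```python
import re


def fill_prompt(template: str, title: str, document: str) -> str:
    """Substitute the two placeholders; str.replace leaves other braces in the text alone."""
    return template.replace("{TITLE}", title).replace("{DOCUMENT}", document)


def _tag(text: str, name: str) -> str:
    """Return the stripped content of the first <name>...</name> pair, or ''."""
    match = re.search(rf"<{name}>(.*?)</{name}>", text, re.DOTALL)
    return match.group(1).strip() if match else ""


def parse_output(generated: str) -> dict:
    """Extract the fields of the <output> block, ignoring the <reason> trace."""
    output = _tag(generated, "output")
    return {
        "key_argument": _tag(output, "key_argument"),
        # Entities are requested as a comma-separated list.
        "entities": [e.strip() for e in _tag(output, "entities").split(",") if e.strip()],
        "title": _tag(output, "title"),
        # Each <point> holds one summary sentence.
        "summary": [p.strip() for p in re.findall(r"<point>(.*?)</point>", output, re.DOTALL)],
    }
```

If the `<output>` block is missing, every field comes back empty, so a caller can detect a malformed generation by checking whether `summary` is an empty list.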

5. Benchmark

TBD
