Dacon-Encrypted-Kobart-Base-v2

Overview

This model restores obfuscated Korean reviews to their original, human-readable form. It is designed to reverse the obfuscation or distortion applied to Korean text, producing a clear, interpretable version of the input.

Model Purpose

The model is trained to handle text obfuscation, primarily Korean reviews that have been deliberately made difficult to read. It restores these reviews to a readable, understandable form so they can be used in downstream natural language processing (NLP) tasks.

Description

  • Task: De-obfuscation of Korean text
  • Objective: Convert difficult-to-read, obfuscated Korean reviews back into their original forms.
  • Data: A custom dataset of obfuscated Korean reviews from the Dacon competition.

Intended Use

This model can be applied in scenarios where Korean text has been distorted or encrypted for privacy, security, or other purposes and needs to be restored to its original state for analysis or review. Example use cases include:

  • Restoring reviews from encrypted forms
  • Processing distorted data for sentiment analysis, opinion mining, and other NLP tasks

Model Details

  • Base Model: KoBART-base-v2
  • Architecture: Encoder-Decoder (BART)
  • Training Data: A dataset from the Dacon Encrypted Korean Reviews competition.
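Since the model is fine-tuned on paired (obfuscated, original) reviews, preparing the training data amounts to reading such pairs from the competition CSV. A minimal sketch using an inline sample; the column names (`input` for the obfuscated review, `output` for the original) are assumptions, not confirmed by the Dacon data description:

```python
import csv
import io

# Hypothetical two-row sample standing in for the competition file;
# the column names ("input" = obfuscated review, "output" = original)
# are assumptions.
raw = io.StringIO(
    "input,output\n"
    "obfuscated review 1,original review 1\n"
    "obfuscated review 2,original review 2\n"
)

sources, targets = [], []
for row in csv.DictReader(raw):
    sources.append(row["input"])   # model input (obfuscated text)
    targets.append(row["output"])  # training target (readable text)

print(sources)
print(targets)
```

Each `sources[i]`/`targets[i]` pair can then be tokenized as the encoder input and decoder label for sequence-to-sequence fine-tuning.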

How to Use

Example:

from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

# Load model and tokenizer
model = AutoModelForSeq2SeqLM.from_pretrained("otter35/Dacon-Encrypted-kobart-base-v2")
tokenizer = AutoTokenizer.from_pretrained("otter35/Dacon-Encrypted-kobart-base-v2")

# Input obfuscated review text
text = "์•ผ... ์นต์ปฅ ์ข‹๊พœ ๋ถ€๋ด"

# Tokenize the input and generate the restored review
# (max_length=128 bounds both the input and the generated output)
inputs = tokenizer(text, return_tensors="pt", max_length=128, truncation=True)
outputs = model.generate(input_ids=inputs["input_ids"], attention_mask=inputs["attention_mask"], max_length=128)
decoded_text = tokenizer.decode(outputs[0], skip_special_tokens=True)

print("Decoded Review:", decoded_text)
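For restoring many reviews at once, the single-input snippet above can be wrapped in a batched helper. This is a sketch, assuming the `model` and `tokenizer` loaded above; the batch size and beam-search settings are illustrative defaults, not values recommended by the model authors:

```python
def restore_reviews(texts, model, tokenizer, batch_size=16, num_beams=4, max_length=128):
    """De-obfuscate a list of Korean reviews in batches."""
    results = []
    for i in range(0, len(texts), batch_size):
        batch = texts[i:i + batch_size]
        # Pad within the batch so all sequences share one tensor shape
        inputs = tokenizer(batch, return_tensors="pt", padding=True,
                           truncation=True, max_length=max_length)
        outputs = model.generate(input_ids=inputs["input_ids"],
                                 attention_mask=inputs["attention_mask"],
                                 num_beams=num_beams, max_length=max_length)
        results.extend(tokenizer.batch_decode(outputs, skip_special_tokens=True))
    return results
```

Usage: `restore_reviews(["์•ผ... ์นต์ปฅ ์ข‹๊พœ ๋ถ€๋ด"], model, tokenizer)` returns a list of decoded reviews in the same order as the inputs.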