data_processes/readme/readme-representer-en.md

# Persian Sentence Representation Script

This script (`p5_representer.py`) is designed to simplify and represent complex Persian legal sentences as a set of simpler, more understandable sentences. It uses the `meta-llama/Meta-Llama-3.1-8B-Instruct` model for this task.

**Note:** For library versions, please refer to the `requirements.txt` file.

## Model Used

- Model: `meta-llama/Meta-Llama-3.1-8B-Instruct`
- Loaded via HuggingFace Transformers (`AutoModelForCausalLM`, `AutoTokenizer`)

## System and User Prompts

- **System prompt:** Sets the model as a legal expert who explains legal texts in simple language for non-experts, without changing technical terms.
- **User prompt:** Asks the model to rewrite the input legal text in a specified number of simple sentences in Persian.

## Main Methods

### 1. `single_section_representation(content)`
- **Purpose:** Simplifies a single legal text section.
- **Inputs:**
  - `content` (str): The legal text to be simplified.
- **Outputs:**
  - `result` (bool): Operation status.
  - `desc` (str): Description of the result.
  - `sentences` (list): List of simplified sentences.

### 2. `do_representation(sections)`
- **Purpose:** Processes multiple sections and saves the results.
- **Inputs:**
  - `sections` (dict): Dictionary where each key is a section ID and each value contains a `content` field.
- **Outputs:**
  - `operation_result` (bool): Overall operation status.
  - `sections` (dict): The input dictionary with an added `represented_sentences` field for each section.

## Example Input

```python
sections = {
    "1": {"content": "این یک متن حقوقی پیچیده است که باید ساده شود."},
    "2": {"content": "متن حقوقی دوم برای بازنمایی."}
}
result, output_sections = do_representation(sections)
```

## Output

Each section will have a new field `represented_sentences` containing the simplified sentences.

## Notes

- The script automatically uses GPU if available.
- Errors for each section are logged in the `./data/represent/` directory.
- The output JSON file is saved in `./data/represent/`.