35 lines
1.3 KiB
Markdown
35 lines
1.3 KiB
Markdown
# Keyword Extractor
|
|
|
|
This source is a script for extracting keywords from text using local LLM such as llama based on user prompts.
|
|
|
|
## How it works
|
|
The script processes input text and extracts the most relevant keywords using a large language model(llm) and system and user prompts which are embedded in the source code.
|
|
|
|
## Requirements
|
|
- Python 3.8+
|
|
- NLP libraries (transformers, torch, etc.)
|
|
- Other utilities as listed in the requirements file
|
|
|
|
For exact versions of the libraries, please check the **`requirements.txt`** file.
|
|
|
|
## Usage
|
|
1. Clone the repository.
|
|
2. Install dependencies:
|
|
```bash
|
|
pip install -r requirements.txt
|
|
```
|
|
3. Run the script:
|
|
```bash
|
|
python keyword_extractor.py
|
|
```
|
|
|
|
## Main Methods
|
|
- `load_model()`: Loads the pre-trained transformer model for text processing. This is the main method for model initialization.
|
|
- `preprocess_text(text)`: Cleans and prepares the input text (e.g., lowercasing, removing stopwords, etc.).
|
|
- `extract_keywords(text, top_n=10)`: The core method that applies the model and retrieves the top keywords from the input text.
|
|
- `display_results(keywords)`: Prints or saves the extracted keywords for further use.
|
|
|
|
## Model
|
|
The script uses a LLM such as llama3.1-8B for keyword extraction. The exact model can be changed in the code if needed.
|
|
|