Flair_NER/peyma_dataset_14030427.log


orgcatorg/xlm-v-base-ner
##################################################
##################################################
2024-07-18 16:35:12.660757: I tensorflow/core/platform/cpu_feature_guard.cc:193] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations: AVX2 FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2024-07-18 16:35:15,455 Reading data from data
2024-07-18 16:35:15,456 Train: data/peyma_train.txt
2024-07-18 16:35:15,456 Dev: None
2024-07-18 16:35:15,456 Test: None
2024-07-18 16:35:17,860 No test split found. Using 10% (i.e. 803 samples) of the train split as test data
2024-07-18 16:35:17,865 No dev split found. Using 10% (i.e. 722 samples) of the train split as dev data
2024-07-18 16:35:17,865 Computing label dictionary. Progress:
0it [00:00, ?it/s]
1it [00:00, 2262.30it/s]
0it [00:00, ?it/s]
3503it [00:00, 35026.69it/s]
6503it [00:00, 35617.87it/s]
2024-07-18 16:35:18,051 Dictionary created for label 'ner' with 1072 values: O (seen 185595 times), های|O (seen 2277 times), ها|O (seen 1045 times), ای|O (seen 611 times), شود|O (seen 515 times), اند|O (seen 277 times), کند|O (seen 273 times), کنند|O (seen 183 times), هایی|O (seen 152 times), تواند|O (seen 124 times), ترین|O (seen 105 times), گذاری|O (seen 100 times), دهد|O (seen 100 times), جمله|O (seen 95 times), طور|O (seen 90 times), که|O (seen 87 times), تر|O (seen 82 times), شوند|O (seen 80 times), کنیم|O (seen 69 times), توان|O (seen 68 times)
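Editor's note: a healthy NER label dictionary has a handful of BIO tags, not 1072 values; entries such as `های|O` and `ماه|I_DAT` look like fused `token|tag` strings being read as labels, which suggests the corpus columns were not split as intended. A minimal sketch of recovering separate token/tag columns from such fused entries (the sample strings and the `B_LOC` tag below are illustrative assumptions, not taken from the log); in Flair the proper fix would be the `column_format` mapping passed to `ColumnCorpus`:

```python
# Sketch (assumption): the training file appears to carry "token|tag" in a
# single column, so the label dictionary above learned fused values such as
# "های|O". Splitting on the LAST '|' recovers the two columns, after which
# a mapping like {0: 'text', 1: 'ner'} would read them correctly.

def split_fused(entry: str) -> tuple[str, str]:
    """Split a 'token|tag' entry into (token, tag) on the last '|'."""
    token, _, tag = entry.rpartition("|")
    return token, tag

# Sample fused entries; 'B_LOC' is a hypothetical tag for illustration.
fused = ["های|O", "ماه|I_DAT", "تهران|B_LOC"]
pairs = [split_fused(f) for f in fused]
print(pairs)  # [('های', 'O'), ('ماه', 'I_DAT'), ('تهران', 'B_LOC')]
```

Using `rpartition` (rather than `split`) keeps any `|` inside the token itself intact, which matters for noisy text.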
Model read successfully!
##################################################
##################################################
2024-07-18 16:35:22,095 SequenceTagger predicts: Dictionary with 1072 tags: O, های|O, ها|O, ای|O, شود|O, اند|O, کند|O, کنند|O, هایی|O, تواند|O, ترین|O, گذاری|O, دهد|O, جمله|O, طور|O, که|O, تر|O, شوند|O, کنیم|O, توان|O, نام|O, رود|O, المللی|O, الله|O, سازی|O, کننده|O, گیری|O, گیرد|O, ی|O, وگو|O, توانند|O, ایم|O, ماه|I_DAT, دهند|O, کنم|O, اش|O, و, ریزی|O, های|I_ORG, رسد|O, زیست|O, شد|O, نامه|O, گوید|O, بینی|O, شان|O, از|O, خاطر|O, را|O, رسانی|O
2024-07-18 16:35:22,107 ----------------------------------------------------------------------------------------------------
2024-07-18 16:35:22,108 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): XLMRobertaModel(
      (embeddings): XLMRobertaEmbeddings(
        (word_embeddings): Embedding(901630, 768)
        (position_embeddings): Embedding(514, 768, padding_idx=1)
        (token_type_embeddings): Embedding(1, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-05, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): XLMRobertaEncoder(
        (layer): ModuleList(
          (0-11): 12 x XLMRobertaLayer(
            (attention): XLMRobertaAttention(
              (self): XLMRobertaSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): XLMRobertaSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-05, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): XLMRobertaIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): XLMRobertaOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-05, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): XLMRobertaPooler(
        (dense): Linear(in_features=768, out_features=768, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=1072, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2024-07-18 16:35:22,108 ----------------------------------------------------------------------------------------------------
2024-07-18 16:35:22,108 Corpus: 6503 train + 722 dev + 803 test sentences
2024-07-18 16:35:22,108 ----------------------------------------------------------------------------------------------------
2024-07-18 16:35:22,108 Train: 6503 sentences
2024-07-18 16:35:22,108 (train_with_dev=False, train_with_test=False)
2024-07-18 16:35:22,108 ----------------------------------------------------------------------------------------------------
2024-07-18 16:35:22,108 Training Params:
2024-07-18 16:35:22,108 - learning_rate: "4e-05"
2024-07-18 16:35:22,108 - mini_batch_size: "10"
2024-07-18 16:35:22,108 - max_epochs: "200"
2024-07-18 16:35:22,109 - shuffle: "True"
2024-07-18 16:35:22,109 ----------------------------------------------------------------------------------------------------
2024-07-18 16:35:22,109 Plugins:
2024-07-18 16:35:22,109 - LinearScheduler | warmup_fraction: '0.1'
2024-07-18 16:35:22,109 ----------------------------------------------------------------------------------------------------
2024-07-18 16:35:22,109 Final evaluation on model after last epoch (final-model.pt)
2024-07-18 16:35:22,109 - metric: "('micro avg', 'f1-score')"
2024-07-18 16:35:22,109 ----------------------------------------------------------------------------------------------------
2024-07-18 16:35:22,109 Computation:
2024-07-18 16:35:22,109 - compute on device: cuda:0
2024-07-18 16:35:22,109 - embedding storage: none
2024-07-18 16:35:22,109 ----------------------------------------------------------------------------------------------------
2024-07-18 16:35:22,109 Model training base path: "taggers"
2024-07-18 16:35:22,109 ----------------------------------------------------------------------------------------------------
2024-07-18 16:35:22,109 ----------------------------------------------------------------------------------------------------
The expanded size of the tensor (573) must match the existing size (514) at non-singleton dimension 1. Target sizes: [10, 573]. Tensor sizes: [1, 514]
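Editor's note: training aborts here because one batch contains a sentence of 573 subtokens, while XLM-R's position embedding table (see `Embedding(514, 768, padding_idx=1)` in the model dump above) holds only 514 entries, i.e. 512 usable positions plus special tokens. A common workaround is to split long inputs into overlapping windows; in Flair this should correspond to constructing the embeddings with `TransformerWordEmbeddings(..., allow_long_sentences=True)` (stated as an assumption about this setup, not verified against this run). A minimal sketch of the windowing arithmetic:

```python
# Sketch (assumed cause): a 573-subtoken sentence cannot be expanded against
# a 514-entry position embedding table, hence the size-mismatch error above.
# Overlapping windows of at most max_len subtokens, advanced by `stride`,
# cover the full sequence; each window then fits the encoder's limit.

def chunk_windows(n_subtokens: int, max_len: int = 512, stride: int = 256):
    """Return (start, end) subtoken windows covering a sequence of n_subtokens."""
    windows = []
    start = 0
    while True:
        end = min(start + max_len, n_subtokens)
        windows.append((start, end))
        if end == n_subtokens:
            break
        start += stride
    return windows

print(chunk_windows(573))  # [(0, 512), (256, 573)]
```

With a stride of half the window, every subtoken falls in at least one window, and embeddings in the overlap can be taken from the window where the token sits furthest from a boundary.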