orgcatorg/xlm-v-base-ner
##################################################
##################################################
2024-07-18 16:35:12.660757: I tensorflow/core/platform/cpu_feature_guard.cc:193] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations: AVX2 FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2024-07-18 16:35:15,455 Reading data from data
2024-07-18 16:35:15,456 Train: data/peyma_train.txt
2024-07-18 16:35:15,456 Dev: None
2024-07-18 16:35:15,456 Test: None
2024-07-18 16:35:17,860 No test split found. Using 10% (i.e. 803 samples) of the train split as test data
2024-07-18 16:35:17,865 No dev split found. Using 10% (i.e. 722 samples) of the train split as dev data
2024-07-18 16:35:17,865 Computing label dictionary. Progress:
1it [00:00, 2262.30it/s]
6503it [00:00, 35617.87it/s]
2024-07-18 16:35:18,051 Dictionary created for label 'ner' with 1072 values: O (seen 185595 times), های|O (seen 2277 times), ها|O (seen 1045 times), ای|O (seen 611 times), شود|O (seen 515 times), اند|O (seen 277 times), کند|O (seen 273 times), کنند|O (seen 183 times), هایی|O (seen 152 times), تواند|O (seen 124 times), ترین|O (seen 105 times), گذاری|O (seen 100 times), دهد|O (seen 100 times), جمله|O (seen 95 times), طور|O (seen 90 times), که|O (seen 87 times), تر|O (seen 82 times), شوند|O (seen 80 times), کنیم|O (seen 69 times), توان|O (seen 68 times)
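For reference, the corpus reading and label-dictionary steps above correspond roughly to the following Flair calls (a minimal sketch; the folder data, the file peyma_train.txt and the label type 'ner' are taken from the log, while the column layout is an assumption):

from flair.datasets import ColumnCorpus

# Assumed CoNLL-style layout: token in column 0, NER tag in column 1.
columns = {0: "text", 1: "ner"}

# Only a train file is supplied, so Flair samples dev and test splits from it,
# which matches the "No test/dev split found" messages above.
corpus = ColumnCorpus("data", columns, train_file="peyma_train.txt")

# Produces the dictionary logged below ("Dictionary created for label 'ner' ...").
label_dict = corpus.make_label_dictionary(label_type="ner")

Note that an NER tag dictionary with 1072 values, most of which look like surface tokens fused with a tag (e.g. های|O), usually means the column layout or the file's column separator does not match what ColumnCorpus expects, so the tag column is not being parsed cleanly; this is worth checking before looking at the error at the end of the log.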
model read successfully!
##################################################
##################################################
2024-07-18 16:35:22,095 SequenceTagger predicts: Dictionary with 1072 tags: O, های|O, ها|O, ای|O, شود|O, اند|O, کند|O, کنند|O, هایی|O, تواند|O, ترین|O, گذاری|O, دهد|O, جمله|O, طور|O, که|O, تر|O, شوند|O, کنیم|O, توان|O, نام|O, رود|O, المللی|O, الله|O, سازی|O, کننده|O, گیری|O, گیرد|O, ی|O, وگو|O, توانند|O, ایم|O, ماه|I_DAT, دهند|O, کنم|O, اش|O, و, ریزی|O, های|I_ORG, رسد|O, زیست|O, شد|O, نامه|O, گوید|O, بینی|O, شان|O, از|O, خاطر|O, را|O, رسانی|O
2024-07-18 16:35:22,107 ----------------------------------------------------------------------------------------------------
2024-07-18 16:35:22,108 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): XLMRobertaModel(
      (embeddings): XLMRobertaEmbeddings(
        (word_embeddings): Embedding(901630, 768)
        (position_embeddings): Embedding(514, 768, padding_idx=1)
        (token_type_embeddings): Embedding(1, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-05, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): XLMRobertaEncoder(
        (layer): ModuleList(
          (0-11): 12 x XLMRobertaLayer(
            (attention): XLMRobertaAttention(
              (self): XLMRobertaSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): XLMRobertaSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-05, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): XLMRobertaIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): XLMRobertaOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-05, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): XLMRobertaPooler(
        (dense): Linear(in_features=768, out_features=768, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=1072, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2024-07-18 16:35:22,108 ----------------------------------------------------------------------------------------------------
2024-07-18 16:35:22,108 Corpus: 6503 train + 722 dev + 803 test sentences
2024-07-18 16:35:22,108 ----------------------------------------------------------------------------------------------------
2024-07-18 16:35:22,108 Train: 6503 sentences
2024-07-18 16:35:22,108 (train_with_dev=False, train_with_test=False)
2024-07-18 16:35:22,108 ----------------------------------------------------------------------------------------------------
2024-07-18 16:35:22,108 Training Params:
2024-07-18 16:35:22,108 - learning_rate: "4e-05"
2024-07-18 16:35:22,108 - mini_batch_size: "10"
2024-07-18 16:35:22,108 - max_epochs: "200"
2024-07-18 16:35:22,109 - shuffle: "True"
2024-07-18 16:35:22,109 ----------------------------------------------------------------------------------------------------
2024-07-18 16:35:22,109 Plugins:
2024-07-18 16:35:22,109 - LinearScheduler | warmup_fraction: '0.1'
2024-07-18 16:35:22,109 ----------------------------------------------------------------------------------------------------
2024-07-18 16:35:22,109 Final evaluation on model after last epoch (final-model.pt)
2024-07-18 16:35:22,109 - metric: "('micro avg', 'f1-score')"
2024-07-18 16:35:22,109 ----------------------------------------------------------------------------------------------------
2024-07-18 16:35:22,109 Computation:
2024-07-18 16:35:22,109 - compute on device: cuda:0
2024-07-18 16:35:22,109 - embedding storage: none
2024-07-18 16:35:22,109 ----------------------------------------------------------------------------------------------------
2024-07-18 16:35:22,109 Model training base path: "taggers"
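The training parameters logged above map onto a fine-tuning call along these lines (a sketch; the variable names are assumptions, while the numbers and the base path "taggers" are taken from the log):

from flair.trainers import ModelTrainer

trainer = ModelTrainer(tagger, corpus)

# fine_tune() uses a linear schedule with warmup by default, which is the
# LinearScheduler plugin with warmup_fraction 0.1 listed above.
trainer.fine_tune(
    "taggers",
    learning_rate=4e-5,
    mini_batch_size=10,
    max_epochs=200,
)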
2024-07-18 16:35:22,109 ----------------------------------------------------------------------------------------------------
2024-07-18 16:35:22,109 ----------------------------------------------------------------------------------------------------
333333333
The expanded size of the tensor (573) must match the existing size (514) at non-singleton dimension 1. Target sizes: [10, 573]. Tensor sizes: [1, 514]
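This error comes from the position embeddings printed earlier: Embedding(514, 768) gives the encoder 512 usable positions (plus two offset slots), but one mini-batch of 10 sentences contains a sequence of 573 subtokens, so the position-embedding tensor cannot be broadcast to it. One common workaround (an assumption, not something shown in this log) is to let Flair split over-long sentences into overlapping windows when embedding them:

from flair.embeddings import TransformerWordEmbeddings

# allow_long_sentences=True makes Flair chunk sequences longer than the
# encoder's window instead of passing them through in one piece; whether it
# is already the default depends on the Flair version.
embeddings = TransformerWordEmbeddings(
    "orgcatorg/xlm-v-base-ner",
    fine_tune=True,
    allow_long_sentences=True,
)

That said, a 573-subtoken "sentence" may also be a symptom of the corpus-parsing issue flagged above: if sentence boundaries in peyma_train.txt are not detected, whole documents end up as single sentences and overflow the 512-subtoken window.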