@a0x8o a0x8o commented Jul 13, 2023

Description

Motivation and Context

How Has This Been Tested?

Screenshots (if appropriate):

Types of changes

  • Bug fix (non-breaking change which fixes an issue)
  • Code improvements with no or little impact
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to change)

Checklist:

  • My code follows the code style of this project.
  • My change requires a change to the documentation.
  • I have updated the documentation accordingly.
  • I have read the CONTRIBUTING page.
  • I have added tests to cover my changes.
  • All new and existing tests passed.

DevinTDHa and others added 30 commits March 14, 2025 17:26
…hi-3.5-Vision

Sparknlp 1060 implement phi 3.5 vision
…LaVA-and-LLaVA-NeXT

SparkNLP 1033: Introducing LLAVA
…g-new-Qwen2-VL-models

SparkNLP 1077- Introducing Qwen2 - VL
…ing-support-to-read-Excel-files

[SPARKNLP-1102] Adding support to read Excel files
…ing-support-to-read-PowerPoint-files

[SPARKNLP-1103] Adding support to read PowerPoint files
…lement-ForMultipleChoice-for-ALBERT

[SPARKNLP-1105] Introducing AlbertForMultipleChoice Transformer
…06-Implement-ForMultipleChoice-for-DistilBERT
…lement-ForMultipleChoice-for-DistilBERT

[SPARKNLP-1106] Introducing DistilBertForMultipleChoice Transformer
…lement-ForMultipeChoice-for-RoBERTa

[SPARKNLP-1107] Introducing RoBertaForMultipleChoice
…08-Implement-ForMultipleChoice-for-XLMRoBERTa
…lement-ForMultipleChoice-for-XLMRoBERTa

[SPARKNLP-1108] Introducing XlmRoBertaForMultipleChoice Transformer
…ing-a-PDF-Reader-to-Spark-NLP

[SPARKNLP-1098] Adding PDF reader support
…lama-3.2-Vision-models

Sparknlp 1078 Introducing llama 3.2 vision models
…UFVisionModel

[SPARKNLP-1079] AutoGGUFVisionModel
* [SPARKNLP-1109] Adding Extractor annotator

* [SPARKNLP-1109] Adding Cleaner annotator

* [SPARKNLP-1109] Adding missing index parameter in python

* [SPARKNLP-1109] Adding right inheritance for Cleaner in python

* [SPARKNLP-1109] Adding notebooks demo for Cleaner and Extractor

* [SPARKNLP-1110] Adding notebook demo for Email reader and Cleaner
…ing-support-to-enhance-read-TXT-files

[SPARKNLP-1113] Adding Text Reader
DevinTDHa and others added 30 commits October 22, 2025 14:39
The script now checks for Java and installs OpenJDK 11 if not present. JAVA_HOME and PATH are also set to ensure Java is available for subsequent steps.
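The check-then-install flow described above can be sketched as follows. This is only an illustration under stated assumptions, not the script from the commit: the function name `plan_java_setup`, the `ASSUMED_JAVA_HOME` path, and the Debian/Ubuntu package name are hypothetical choices, and the function only plans the install command rather than running it.

```python
import os
import shutil
from typing import Callable, Dict, List, Optional

# Hypothetical install location; the real path depends on the distribution.
ASSUMED_JAVA_HOME = "/usr/lib/jvm/java-11-openjdk-amd64"


def plan_java_setup(
    env: Dict[str, str],
    which: Callable[[str], Optional[str]] = shutil.which,
) -> List[str]:
    """Return the install commands a setup script might run; update env in place."""
    commands: List[str] = []
    if which("java") is None:
        # Java is missing: plan an OpenJDK 11 install (Debian/Ubuntu package assumed).
        commands.append("apt-get install -y openjdk-11-jdk")
    # Keep an existing JAVA_HOME, otherwise fall back to the assumed location.
    java_home = env.setdefault("JAVA_HOME", ASSUMED_JAVA_HOME)
    java_bin = os.path.join(java_home, "bin")
    current_path = env.get("PATH", "")
    path_entries = current_path.split(os.pathsep) if current_path else []
    if java_bin not in path_entries:
        # Prepend the JDK bin directory so later steps find this Java first.
        env["PATH"] = os.pathsep.join([java_bin] + path_entries)
    return commands
```

Passing `which` as a parameter keeps the sketch testable without touching the real system; a caller would normally rely on the `shutil.which` default and pass `os.environ`.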
- Fill output array in place to reduce RAM usage
- use sortWithinPartitions instead of a custom map over partitions to
  not materialize rows
* Reader2Doc: new defaults always output a single document

* XMLReader improvements

- no longer outputs empty text
- can extract tag attribute values
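
The two XMLReader behaviours listed above (skipping empty text, extracting attribute values) can be illustrated with Python's standard library. This is a rough sketch and not the Spark NLP implementation: `extract_xml_elements` and its `include_attributes` flag are hypothetical names, and tail text after a closing tag is ignored for brevity.

```python
import xml.etree.ElementTree as ET
from typing import List


def extract_xml_elements(xml_text: str, include_attributes: bool = True) -> List[str]:
    """Collect non-empty text and (optionally) attribute values in document order."""
    results: List[str] = []
    for element in ET.fromstring(xml_text).iter():
        if include_attributes:
            # Each non-blank attribute value becomes its own entry.
            results.extend(v for v in element.attrib.values() if v.strip())
        text = (element.text or "").strip()
        if text:  # skip elements whose text is missing or whitespace-only
            results.append(text)
    return results
```

For example, `<note>   </note>` and `<empty/>` contribute nothing to the result, while `<title lang="en">Spark</title>` contributes both `en` and `Spark` when attributes are included.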

* Reader2Doc improvements

- adjusted defaults, so we always output a single large document
- can specify join char with new parameter
- adjusted other readers for new defaults
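
The "single large document" default with a configurable join character amounts to concatenating the parsed fragments. A minimal sketch, assuming a hypothetical helper `join_fragments` with a hypothetical `join_string` parameter (the actual parameter name is not given in the commit message):

```python
from typing import Iterable


def join_fragments(fragments: Iterable[str], join_string: str = "\n") -> str:
    """Merge parsed fragments into one document, skipping blank fragments."""
    return join_string.join(f for f in fragments if f.strip())
```

With the default, `join_fragments(["Title", "", "Body text"])` yields one string with the two non-empty fragments separated by a newline; passing `join_string=" "` would separate them with a space instead.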

* Reader2Doc improvements python side

* ReaderAssembler: Fix failing test
* Add model 2025-04-09-sent_arabic_monomodel_monotok_en

* Add model 2025-04-09-sent_schwurpert_pipeline_de

* Add model 2025-04-08-wav2vec2_large_xls_r_300m_hindi_devendr_en

* Add model 2025-04-08-dialogpt_medium_harry_pipeline_en

* Add model 2025-04-09-gpt_2_finetuning_airaid_en

* Add model 2025-04-08-mchammer_pipeline_en

* Add model 2025-04-09-wav2vec2_large_xls_r_300m_kor_11385_2_en

* Add model 2025-04-09-sent_bert_base_stackoverflow_comments_2m_pipeline_en

* Add model 2025-04-08-shape_nato_pipeline_en

* Add model 2025-04-09-burmese_awesome_wnut_model_ai_pipeline_en

* Add model 2025-04-09-vit_female_age_classification_en

* Add model 2025-04-09-vit_base_oxford_iiit_pets_niko132_pipeline_en

* Add model 2025-04-09-koriposting_en

* Add model 2025-04-09-rockdrigoma_pipeline_en

* Add model 2025-04-09-vit_base_patch16_224_finetuned_cedar_en

* Add model 2025-04-09-williamblakebot_pipeline_en

* Add model 2025-04-09-bert_base_train_book_ent_15p_ra_en

* Add model 2025-04-09-tinybert_train_book_ent_15p_en

* Add model 2025-04-08-exp_w2v2t_indonesian_xlsr_53_s358_id

* Add model 2025-04-08-bert_finetuned_ner_accelerate_atichets_pipeline_en

* Add model 2025-04-09-brad_buchsbaum_en

* Add model 2025-04-09-honeytech_pipeline_en

* Add model 2025-04-09-extended_gender_classifier_en

* Add model 2025-04-09-smids_1x_deit_tiny_rms_001_fold3_pipeline_en

* Add model 2025-04-09-icelynjennings_pipeline_en

* Add model 2025-04-09-jackposobiec_pipeline_en

* Add model 2025-04-09-sent_finnish_monomodel_monotok_pipeline_en

* Add model 2025-04-08-exp5_10partition_modelo_asl6000_pipeline_en

* Add model 2025-04-08-output_pipeline_pt

* Add model 2025-04-09-bert_finetuned_ner_huizhoucheng_en

* Add model 2025-04-09-icelynjennings_en

* Add model 2025-04-09-sent_tiny_mlm_glue_mnli_from_scratch_custom_tokenizer_expand_vocab_en

* Add model 2025-04-09-sent_drclips_en

* Add model 2025-04-09-sent_nbme_bio_clinicalbert_en

* Add model 2025-04-09-finetune_model_bert_en

* Add model 2025-04-09-bert_finetuned_ner_fundrais123_en

* Add model 2025-04-09-filler_username_pipeline_en

* Add model 2025-04-09-gpt2_chatbot_kuttersn_en

* Add model 2025-04-09-musebiihi_pipeline_en

* Add model 2025-04-09-disconcision_pipeline_en

* Add model 2025-04-09-arxiv_classifier_debertav3_en

* Add model 2025-04-08-wenger_en

* Add model 2025-04-08-burmese_awesome_model_recod_en

* Add model 2025-04-09-exp_w2v2t_portuguese_norwegian_pretraining_s84_pt

* Add model 2025-04-09-sent_bert_base_uncased_finetuned_mol_mlm_0_3_en

* Add model 2025-04-09-sent_tlm_rct_20k_large_scale_pipeline_en

* Add model 2025-04-08-jen_122_pipeline_en

* Add model 2025-04-09-dkulchar_pipeline_en

* Add model 2025-04-09-pico8degalaleo_pipeline_en

* Add model 2025-04-09-dialogpt_medium_captainprice_extended_en

* Add model 2025-04-09-wav2vec2_gujarati_stt_pipeline_en

* Add model 2025-04-08-smids_5x_deit_small_rms_00001_fold1_en

* Add model 2025-04-09-sent_minilm_l12_h384_uncased_finetuned_imdb_en

* Add model 2025-04-09-bert_suicide_detection_hk_large_nepal_bhasa_pipeline_en

* Add model 2025-04-09-distilbert_base_uncased_news_sentiment_finetuned_english_en

* Add model 2025-04-08-monopolyfornite_en

* Add model 2025-04-08-dialogpt_small_shy_en

* Add model 2025-04-09-distilbert_token_itr0_0_0001_editorials_01_03_2022_15_20_12_pipeline_en

* Add model 2025-04-09-kehlani_pipeline_en

* Add model 2025-04-09-burmese_awesome_humanaction_model_pipeline_en

* Add model 2025-04-09-tigers_side_vit_en

* Add model 2025-04-09-stp_classifier_13_1_en

* Add model 2025-04-08-nepali_grammar_error_detection_20250311_1323_en

* Add model 2025-04-09-mldz4shad_en

* Add model 2025-04-09-exp_w2v2t_swedish_northern_sami_xlsr_53_s328_en

* Add model 2025-04-09-bert_base_uncased_token_itr0_0_0001_train_essays_test_test_set_05_03_2022_05_58_31_en

* Add model 2025-04-09-wav2vec2_xlsr_53_marathi_large_en

* Add model 2025-04-09-hushem_5x_deit_small_adamax_0001_fold1_pipeline_en

* Add model 2025-04-09-lora_toxic_comment_pipeline_en

* Add model 2025-04-09-absa_turkish_bert_based_small_tr

* Add model 2025-04-08-smids_1x_deit_tiny_rms_001_fold5_en

* Add model 2025-04-09-wav2vec2_base_timit_demo_colab_bsen_pipeline_en

* Add model 2025-04-09-bert_base_turkish_sentiment_analysis_pipeline_tr

* Add model 2025-04-09-bert_base_turkish_sentiment_analysis_tr

* Add model 2025-04-09-bert_base_turkish_offensive_pipeline_tr

* Add model 2025-04-09-document_type_identification_en

* Add model 2025-04-09-sent_bnbert_pipeline_en

* Add model 2025-04-09-wav2vec2_large_xls_r_300m_tamil_colab_aakhilesh_en

* Add model 2025-04-08-sent_mbert_tlm_sent_english_chinese_en

* Add model 2025-04-08-pii_protection_model_pipeline_en

* Add model 2025-04-09-bert_tiny_finetuned_xglue_ner_en

* Add model 2025-04-08-wav2vec2_large_xls_r_300m_urdu_colab_pipeline_en

* Add model 2025-04-09-sent_bert_base_uncased_issues_128_xxr_pipeline_en

* Add model 2025-04-09-sent_mbert_tlm_chat_english_german_en

* Add model 2025-04-09-db_slr_1_1e_en

* Add model 2025-04-08-cher_pipeline_en

* Add model 2025-04-09-wav2vec2_base_libir_zenodo_pipeline_en

* Add model 2025-04-09-vit_epochs5_batch32_lr5e_05_size224_tiles4_seed3_q3_dropout_v2_en

* Add model 2025-04-09-wav2vec2_base_test_pipeline_en

* Add model 2025-04-09-lesseyecontact_en

* Add model 2025-04-09-wav2vec2_base_swbd_turn_eos_long_short_utt_removed_5percent_pipeline_en

* Add model 2025-04-09-micbucci_pipeline_en

* Add model 2025-04-09-veganseltzer_pipeline_en

* Add model 2025-04-08-dialogpt_medium_ff7_en

* Add model 2025-04-09-sent_storieslm_v1_1945_pipeline_en

* Add model 2025-04-09-sent_mbert_tlm_chat_english_chinese_pipeline_en

* Add model 2025-04-09-dialogpt_medium_milo_en

* Add model 2025-04-09-dataandme_en

* Add model 2025-04-09-lumetroid_en

* Add model 2025-04-09-dialogpt_medium_milo_pipeline_en

* Add model 2025-04-09-bbcqos_fitslut63_kellyg_official_en

* Add model 2025-04-09-stp_classifier_13_1_pipeline_en

* Add model 2025-04-09-vit_base_beans_demo_v5_hwooo92_pipeline_en

* Add model 2025-04-09-ridiculouscrabs_en

* Add model 2025-04-08-autotrain_20_12_2022_exam_part3_2543877946_pipeline_en

* Add model 2025-04-09-zemfira_en

* Add model 2025-04-09-michaeltrazzi_pipeline_en

* Add model 2025-04-09-absa_turkish_bert_based_small_pipeline_tr

* Add model 2025-04-09-gunna_pipeline_en

* Add model 2025-04-09-ourqueeningreen_pipeline_en

* Add model 2025-04-09-jenslennartsson_pipeline_en

* Add model 2025-04-09-sent_bottleneckbertsmall_en

* Add model 2025-04-09-dialogpt_mid_hpai_en

* Add model 2025-04-09-shelbythanna_en

* Add model 2025-04-09-macintoxic_en

* Add model 2025-04-09-square_rundi_square_rundi_second_vote_full_pic_25_age_gender_en

* Add model 2025-04-09-sent_first_try_rubert_200_16_16_25ep_en

* Add model 2025-04-09-postpostpostr_en

* Add model 2025-04-09-richardsocher_en

* Add model 2025-04-09-bert_base_german_cased_finetuned_subj_v1_pipeline_en

* Add model 2025-04-09-guggersylvain_pipeline_en

* Add model 2025-04-09-guggersylvain_en

* Add model 2025-04-09-macegrunow_en

* Add model 2025-04-09-macegrunow_pipeline_en

* Add model 2025-04-09-nueclear333_pipeline_en

* Add model 2025-04-09-olikuchi_en

* Add model 2025-04-09-wav2vec2_large_xlsr_53_full_train_full_train_pipeline_en

* Add model 2025-04-09-lanalilligant_en

* Add model 2025-04-08-peppa_pipeline_en

* Add model 2025-04-08-3_epochs_classifier_en

* Add model 2025-04-08-bert_base_greek_uncased_v1_finetuned_ner_pipeline_en

* Add model 2025-04-09-deit_base_patch16_224_rice_leaf_disease_augmented_tagalog_pipeline_en

* Add model 2025-04-08-wav2vec2_large_xlsr_estonian_m3hrdadfi_pipeline_et

* Add model 2025-04-08-sent_bert_base_uncased_multi_128_pipeline_en

* Add model 2025-04-09-mspunks_en

* Add model 2025-04-09-mspunks_pipeline_en

* Add model 2025-04-09-vit_base_patch16_224_masaratti_pipeline_en

* Add model 2025-04-09-burmese_awesome_emotion_identifier_model_en

* Add model 2025-04-09-wav2vec2_large_xls_r_300m_chichewa_colab_en

* Add model 2025-04-09-lesseyecontact_pipeline_en

* Add model 2025-04-07-dialogpt_small_rick_havokx_pipeline_en

* Add model 2025-04-08-wav2vec2_large_uralic_voxpopuli_v2_sami_parl_ext_ft_en

* Add model 2025-04-09-dnlklr_pipeline_en

* Add model 2025-04-09-wav2vec2_base_cynthia_timit_pipeline_en

* Add model 2025-04-09-mri_classifier_djibri_pipeline_en

* 2025-04-11-smolvlm_instruct_int4_en (#14550)

* Add model 2025-04-11-smolvlm_instruct_int4_en

* Add model 2025-04-14-paligemma_3b_pt_224_int4_en

* Add model 2025-04-15-paligemma_3b_ft_vqav2_448_int4_en

* Add model 2025-04-15-paligemma_3b_pt_224_int4_en

* Add model 2025-04-15-paligemma2_3b_pt_448_int4_en

* Add model 2025-04-15-paligemma2_3b_mix_224_int4_en

* Add model 2025-04-28-gemma_3_4b_it_int4_en

* Add model 2025-04-28-gemma_3_4b_pt_int4_en

---------

Co-authored-by: prabod <[email protected]>

* 2025-05-16-internvl2_1b_int4_en (#14577)

* Add model 2025-05-16-internvl2_1b_int4_en

* Add model 2025-05-16-internvl2_5_1b_int4_en

* Add model 2025-05-16-internvl3_1b_int4_en

* Add model 2025-05-16-internvl3_2b_int4_en

* Add model 2025-05-16-internvl3_8b_int4_en

* Add model 2025-05-16-internvl2_5_4b_int4_en

* Add model 2025-05-27-florence_2_base_ft_int4_en

* Add model 2025-05-27-florence_2_base_int4_en

* Add model 2025-05-27-florence_2_large_ft_int4_en

* Add model 2025-05-27-florence_2_large_int4_en

---------

Co-authored-by: prabod <[email protected]>

* 2025-05-17-internvl3_8b_int4_en (#14580)

* Add model 2025-05-17-internvl3_8b_int4_en

* Add model 2025-05-20-mmarco_mminilmv2_l12_h384_v1_nreimers_en

* Add model 2025-05-20-mmarco_mminilmv2_l12_h384_v1_nreimers_pipeline_en

* Add model 2025-05-20-bge_reranker_base_baai_en

* Add model 2025-05-20-xlm_roberta_base_language_detection_xx

* Add model 2025-05-20-bge_reranker_base_baai_pipeline_en

* Add model 2025-05-20-xlm_roberta_base_language_detection_pipeline_xx

* Add model 2025-05-20-twitter_xlm_roberta_base_sentiment_multilingual_xx

* Add model 2025-05-20-korean_reranker_ko

* Add model 2025-05-20-korean_reranker_pipeline_ko

* Add model 2025-05-20-twitter_xlm_roberta_base_sentiment_multilingual_pipeline_xx

* Add model 2025-05-20-bce_reranker_base_v1_maidalun1020_pipeline_en

* Add model 2025-05-20-bce_reranker_base_v1_maidalun1020_en

* Add model 2025-05-20-multilingual_iptc_news_topic_classifier_xx

* Add model 2025-05-20-bge_reranker_v2_m3_en

* Add model 2025-05-20-multilingual_iptc_news_topic_classifier_pipeline_xx

* Add model 2025-05-20-bge_reranker_v2_m3_pipeline_en

* Add model 2025-05-20-xlm_roberta_base_romanian_ner_ronec_ro

* Add model 2025-05-20-xlm_roberta_ner_japanese_ja

* Add model 2025-05-20-xlm_roberta_base_romanian_ner_ronec_pipeline_ro

* Add model 2025-05-20-xlm_roberta_ner_japanese_pipeline_ja

* Add model 2025-05-20-xlm_roberta_large_finetuned_conll03_english_xx

* Add model 2025-05-20-xlm_roberta_large_finetuned_conll03_german_xx

* Add model 2025-05-20-fullstop_punctuation_multilang_large_en

* Add model 2025-05-20-xlm_roberta_large_finetuned_conll03_english_pipeline_xx

* Add model 2025-05-20-fullstop_punctuation_multilang_large_pipeline_en

* Add model 2025-05-20-xlm_roberta_large_finetuned_conll03_german_pipeline_xx

* Add model 2025-05-20-xlm_roberta_large_ner_spanish_es

* Add model 2025-05-20-sent_twitter_xlm_roberta_base_en

* Add model 2025-05-20-sent_twitter_xlm_roberta_base_pipeline_en

* Add model 2025-05-20-sent_infoxlm_base_en

* Add model 2025-05-20-sent_mminilmv2_l12_h384_distilled_from_xlmr_large_en

* Add model 2025-05-20-sent_infoxlm_base_pipeline_en

* Add model 2025-05-20-sent_mminilmv2_l12_h384_distilled_from_xlmr_large_pipeline_en

* Add model 2025-05-20-sent_infoxlm_large_en

* Add model 2025-05-20-sent_xlm_roberta_large_xx

* Add model 2025-05-21-clip_vit_base_patch16_en

* Add model 2025-05-21-fashion_clip_en

* Add model 2025-05-21-clip_vit_base_patch16_pipeline_en

* Add model 2025-05-21-fashion_clip_pipeline_en

* Add model 2025-05-21-zero_shot_classifier_clip_vit_base_patch32_en

* Add model 2025-05-21-zero_shot_classifier_clip_vit_base_patch32_pipeline_en

* Add model 2025-05-21-clip_vit_large_patch14_336_en

* Add model 2025-05-21-xlmroberta_qa_ukrainian_uk

* Add model 2025-05-21-xlmroberta_qa_ukrainian_pipeline_uk

* Add model 2025-05-21-xlm_roberta_qa_xlm_roberta_base_arabic_ar

* Add model 2025-05-21-xlm_roberta_qa_xlm_roberta_base_arabic_pipeline_ar

* Add model 2025-05-21-xlm_roberta_qa_xlm_roberta_base_squad2_distilled_en

* Add model 2025-05-21-xlm_roberta_qa_xlm_roberta_base_squad2_distilled_pipeline_en

* Add model 2025-05-21-xlmr_large_qa_persian_farsi_fa

* Add model 2025-05-21-persian_xlm_roberta_large_en

* Add model 2025-05-21-xlmr_large_qa_persian_farsi_pipeline_fa

* Add model 2025-05-21-persian_xlm_roberta_large_pipeline_en

* Add model 2025-05-21-xlm_roberta_large_qa_multilingual_finedtuned_russian_xx

* Add model 2025-05-21-xlm_roberta_large_qa_multilingual_finedtuned_russian_pipeline_xx

* Add model 2025-05-21-xlm_roberta_large_xquad_en

* Add model 2025-05-21-xlm_roberta_large_xquad_pipeline_en

* Add model 2025-05-21-mminilmv2_l12_h384_distilled_from_xlmr_large_en

* Add model 2025-05-21-mminilmv2_l12_h384_distilled_from_xlmr_large_pipeline_en

* Add model 2025-05-21-twitter_xlm_roberta_base_en

* Add model 2025-05-21-twitter_xlm_roberta_base_pipeline_en

* Add model 2025-05-21-xlm_roberta_base_xx

* Add model 2025-05-21-xlm_roberta_base_pipeline_xx

* Add model 2025-05-21-infoxlm_large_en

* Add model 2025-05-21-infoxlm_base_en

* Add model 2025-05-21-infoxlm_base_pipeline_en

* Add model 2025-05-21-xlm_roberta_large_xx

* Add model 2025-05-21-xlm_v_base_xx

* Add model 2025-05-21-infoxlm_large_pipeline_en

* Add model 2025-05-21-xlm_roberta_large_pipeline_xx

* Add model 2025-05-21-xlm_v_base_pipeline_xx

* Add model 2025-05-21-robbert_v2_dutch_ner_nl

* Add model 2025-05-21-roberta_large_ner_english_en

* Add model 2025-05-21-robbert_v2_dutch_ner_pipeline_nl

* Add model 2025-05-21-roberta_large_tweetner7_all_en

* Add model 2025-05-21-roberta_large_ner_english_pipeline_en

* Add model 2025-05-21-roberta_token_classifier_sayula_popoluca_tagger_id

* Add model 2025-05-21-roberta_token_classifier_sayula_popoluca_tagger_pipeline_id

* Add model 2025-05-21-roberta_large_tweetner7_all_pipeline_en

* Add model 2025-05-22-twitter_roberta_base_sentiment_en

* Add model 2025-05-22-roberta_hate_speech_dynabench_r4_target_en

* Add model 2025-05-22-twitter_roberta_base_sentiment_latest_en

* Add model 2025-05-22-robertuito_sentiment_analysis_pipeline_es

* Add model 2025-05-22-roberta_base_go_emotions_en

* Add model 2025-05-22-roberta_hate_speech_dynabench_r4_target_pipeline_en

* Add model 2025-05-22-roberta_classifier_emotion_english_distil_base_pipeline_en

* Add model 2025-05-22-robertuito_sentiment_analysis_es

* Add model 2025-05-22-twitter_roberta_base_sentiment_latest_pipeline_en

* Add model 2025-05-22-twitter_roberta_base_sentiment_pipeline_en

* Add model 2025-05-22-roberta_classifier_emotion_english_distil_base_en

* Add model 2025-05-22-roberta_large_mnli_pipeline_en

* Add model 2025-05-22-roberta_large_mnli_en

* Add model 2025-05-22-roberta_base_go_emotions_pipeline_en

* Add model 2025-05-22-twitter_roberta_base_sentiment_latest_en

* Add model 2025-05-22-roberta_hate_speech_dynabench_r4_target_en

* Add model 2025-05-22-twitter_roberta_base_sentiment_en

* Add model 2025-05-22-robertuito_sentiment_analysis_pipeline_es

* Add model 2025-05-22-twitter_roberta_base_sentiment_pipeline_en

* Add model 2025-05-22-roberta_base_go_emotions_pipeline_en

* Add model 2025-05-22-roberta_hate_speech_dynabench_r4_target_pipeline_en

* Add model 2025-05-22-roberta_classifier_emotion_english_distil_base_en

* Add model 2025-05-22-robertuito_sentiment_analysis_es

* Add model 2025-05-22-roberta_large_mnli_en

* Add model 2025-05-22-roberta_large_mnli_pipeline_en

* Add model 2025-05-22-roberta_classifier_emotion_english_distil_base_pipeline_en

* Add model 2025-05-22-roberta_base_go_emotions_en

* Add model 2025-05-22-twitter_roberta_base_sentiment_latest_pipeline_en

* Add model 2025-05-22-distilroberta_base_en

* Add model 2025-05-22-codebert_python_en

* Add model 2025-05-22-distilroberta_base_pipeline_en

* Add model 2025-05-22-roberta_base_en

* Add model 2025-05-22-chemberta_zinc_base_v1_en

* Add model 2025-05-22-roberta_base_pipeline_en

* Add model 2025-05-22-roberta_large_en

* Add model 2025-05-22-chemberta_zinc_base_v1_pipeline_en

* Add model 2025-05-22-codebert_python_pipeline_en

* Add model 2025-05-22-roberta_large_pipeline_en

* Add model 2025-05-22-amd_power_dialer_v1_en

* Add model 2025-05-22-coherence_all_mpnet_base_v2_en

* Add model 2025-05-22-information_content_model_en

* Add model 2025-05-22-icelandic_nepal_bhasa_dataset_teacher_model_en

* Add model 2025-05-22-amd_full_phonetree_v1_pipeline_en

* Add model 2025-05-22-amd_partial_phonetree_v1_en

* Add model 2025-05-22-amd_partial_v1_en

* Add model 2025-05-22-burmese_setfit_classifier_threat_en

* Add model 2025-05-22-coherence_all_mpnet_base_v2_pipeline_en

* Add model 2025-05-22-hub_report_20241202125641_pipeline_en

* Add model 2025-05-22-amd_partial_v1_pipeline_en

* Add model 2025-05-22-amd_partial_phonetree_v1_pipeline_en

* Add model 2025-05-22-burmese_setfit_classifier_threat_pipeline_en

* Add model 2025-05-22-setfit_model_en

* Add model 2025-05-22-icelandic_nepal_bhasa_dataset_teacher_model_pipeline_en

* Add model 2025-05-22-setfit_model_pipeline_en

* Add model 2025-05-22-amd_power_dialer_v1_pipeline_en

* Add model 2025-05-22-hub_report_20241202125641_en

* Add model 2025-05-22-amd_full_phonetree_v1_en

* Add model 2025-05-22-information_content_model_pipeline_en

* Add model 2025-05-22-autotrain_kjxi3_hql8x_en

* Add model 2025-05-22-multi_qa_mpnet_base_dot_v1_finetuned_squad2_all_en

* Add model 2025-05-22-covid_qa_mpnet_en

* Add model 2025-05-22-multi_qa_mpnet_base_dot_v1_finetuned_squad2_all_pipeline_en

* Add model 2025-05-22-covid_qa_mpnet_pipeline_en

* Add model 2025-05-22-autotrain_kjxi3_hql8x_pipeline_en

* Add model 2025-05-22-multi_qa_mpnet_base_cos_v1_sentence_transformers_en

* Add model 2025-05-22-multi_qa_mpnet_base_dot_v1_en

* Add model 2025-05-22-paraphrase_mpnet_base_v2_en

* Add model 2025-05-22-patentsberta_en

* Add model 2025-05-22-all_mpnet_base_v2_sentence_transformers_pipeline_en

* Add model 2025-05-22-multi_qa_mpnet_base_cos_v1_sentence_transformers_pipeline_en

* Add model 2025-05-22-patentsberta_pipeline_en

* Add model 2025-05-22-fin_mpnet_base_en

* Add model 2025-05-22-nli_mpnet_base_v2_en

* Add model 2025-05-22-fin_mpnet_base_pipeline_en

* Add model 2025-05-22-biolord_2023_c_en

* Add model 2025-05-22-all_mpnet_base_v2_sentence_transformers_en

* Add model 2025-05-22-paraphrase_mpnet_base_v2_pipeline_en

* Add model 2025-05-22-biolord_2023_pipeline_en

* Add model 2025-05-22-multi_qa_mpnet_base_dot_v1_pipeline_en

* Add model 2025-05-22-biolord_2023_c_pipeline_en

* Add model 2025-05-22-biolord_2023_en

* Add model 2025-05-22-nli_mpnet_base_v2_pipeline_en

* Add model 2025-05-22-e5_small_v2_intfloat_en

* Add model 2025-05-22-e5_small_en

* Add model 2025-05-22-e5_small_v2_intfloat_pipeline_en

* Add model 2025-05-22-e5_small_pipeline_en

* Add model 2025-05-22-e5_base_v2_intfloat_pipeline_en

* Add model 2025-05-22-e5_base_pipeline_en

* Add model 2025-05-22-sentence_transformers_e5_large_v2_en

* Add model 2025-05-22-e5_large_en

* Add model 2025-05-22-sentence_transformers_e5_large_v2_pipeline_en

* Add model 2025-05-22-e5_base_en

* Add model 2025-05-22-e5_base_v2_intfloat_en

* Add model 2025-05-22-e5_large_pipeline_en

* Add model 2025-05-22-e5_large_v2_intfloat_en

* Add model 2025-05-22-e5_large_v2_intfloat_pipeline_en

* Add model 2025-05-24-e5_small_en

* Add model 2025-05-24-e5_small_v2_intfloat_en

* Add model 2025-05-24-e5_small_pipeline_en

* Add model 2025-05-24-e5_base_v2_intfloat_pipeline_en

* Add model 2025-05-24-e5_small_v2_intfloat_pipeline_en

* Add model 2025-05-24-sentence_transformers_e5_large_v2_en

* Add model 2025-05-24-e5_base_v2_intfloat_en

* Add model 2025-05-24-e5_large_en

* Add model 2025-05-24-e5_base_pipeline_en

* Add model 2025-05-24-sentence_transformers_e5_large_v2_pipeline_en

* Add model 2025-05-24-e5_large_v2_intfloat_en

* Add model 2025-05-24-e5_large_pipeline_en

* Add model 2025-05-24-e5_base_en

* Add model 2025-05-25-distilbert_tok_classifier_typo_detector_en

* Add model 2025-05-25-biomedical_ner_all_d4data_en

* Add model 2025-05-25-distilbert_ner_distilbert_base_cased_finetuned_conll03_english_en

* Add model 2025-05-25-distilbert_ner_distilbert_base_cased_finetuned_conll03_english_pipeline_en

* Add model 2025-05-25-distilbert_finetuned_ai4privacy_v2_en

* Add model 2025-05-25-distilbert_ner_distilbert_base_multilingual_cased_ner_hrl_nl

* Add model 2025-05-25-biomedical_ner_all_d4data_pipeline_en

* Add model 2025-05-25-distilbert_base_multilingual_cased_pii_xx

* Add model 2025-05-25-distilbert_token_classifier_keyphrase_extraction_inspec_pipeline_en

* Add model 2025-05-25-chonky_distilbert_base_uncased_1_en

* Add model 2025-05-25-distilbert_ner_dslim_en

* Add model 2025-05-25-distilbert_tok_classifier_typo_detector_pipeline_en

* Add model 2025-05-25-chonky_distilbert_base_uncased_1_pipeline_en

* Add model 2025-05-25-distilbert_finetuned_ai4privacy_v2_pipeline_en

* Add model 2025-05-25-distilbert_base_multilingual_cased_pii_pipeline_xx

* Add model 2025-05-25-distilbert_ner_distilbert_base_multilingual_cased_ner_hrl_pipeline_nl

* Add model 2025-05-25-distilbert_ner_dslim_pipeline_en

* Add model 2025-05-25-distilbert_token_classifier_keyphrase_extraction_inspec_en

* Add model 2025-05-25-distilbert_base_uncased_go_emotions_student_en

* Add model 2025-05-25-toxic_comment_model_en

* Add model 2025-05-25-nsfw_text_classifier_en

* Add model 2025-05-25-distilbert_nsfw_text_classifier_pipeline_en

* Add model 2025-05-25-distilbert_base_uncased_go_emotions_student_pipeline_en

* Add model 2025-05-25-toxic_comment_model_pipeline_en

* Add model 2025-05-25-nsfw_text_classifier_pipeline_en

* Add model 2025-05-25-distilbert_nsfw_text_classifier_en

* Add model 2025-05-25-multilingual_sentiment_analysis_xx

* Add model 2025-05-25-multilingual_sentiment_analysis_pipeline_xx

* Add model 2025-05-27-thainer_corpus_v2_base_model_th

* Add model 2025-05-27-thainer_corpus_v2_base_model_pipeline_th

* Add model 2025-05-27-phayathaibert_thainer_th

* Add model 2025-05-27-nermembert_base_4entities_fr

* Add model 2025-05-27-cas_biomedical_sayula_popoluca_tagging_fr

* Add model 2025-05-27-phayathaibert_thainer_pipeline_th

* Add model 2025-05-27-nermembert_large_3entities_fr

* Add model 2025-05-27-nermembert_large_3entities_pipeline_fr

* Add model 2025-05-27-cas_biomedical_sayula_popoluca_tagging_pipeline_fr

* Add model 2025-05-27-nermembert_base_4entities_pipeline_fr

* Add model 2025-05-27-rubert_base_cased_nli_threeway_ru

* Add model 2025-05-27-rubert_base_cased_nli_threeway_pipeline_ru

---------

Co-authored-by: ahmedlone127 <[email protected]>

* Add model 2025-06-10-e5v_int4_en (#14599)

Co-authored-by: prabod <[email protected]>

* Add model 2025-06-23-minilm_l6_v2_en

* Add model 2025-06-22-bert_classifier_finbert_tone_en

* Add model 2025-06-22-bert_classifier_finbert_tone_pipeline_en

* Add model 2025-06-22-finbert_pipeline_en

* Add model 2025-06-22-bert_base_multilingual_uncased_sentiment_xx

* Add model 2025-06-22-bert_base_multilingual_uncased_sentiment_pipeline_xx

* Add model 2025-06-22-finbert_en

* Add model 2025-06-22-bert_base_multilingual_cased_google_bert_xx

* Add model 2025-06-22-bert_base_multilingual_cased_google_bert_pipeline_xx

* Add model 2025-06-22-bert_base_uncased_google_bert_en

* Add model 2025-06-22-bert_base_cased_google_bert_pipeline_en

* Add model 2025-06-22-bert_base_cased_google_bert_en

* Add model 2025-06-22-bert_base_uncased_google_bert_pipeline_en

* Add model 2025-06-22-sent_bert_base_multilingual_cased_xx

* Add model 2025-06-22-sent_bert_base_multilingual_cased_pipeline_xx

* Add model 2025-06-22-sent_bert_base_cased_en

* Add model 2025-06-22-sent_bert_base_cased_pipeline_en

* Add model 2025-06-22-sent_bert_base_uncased_pipeline_en

* Add model 2025-06-22-sent_bert_base_uncased_en

* Add model 2025-06-22-camembert_bio_base_fr

* Add model 2025-06-22-camembert_bio_base_pipeline_fr

* Add model 2025-06-22-drbert_7gb_fr

* Add model 2025-06-22-umberto_commoncrawl_cased_v1_it

* Add model 2025-06-22-camembert_base_fr

* Add model 2025-06-22-umberto_commoncrawl_cased_v1_pipeline_it

* Add model 2025-06-22-drbert_7gb_pipeline_fr

* Add model 2025-06-22-sloberta_pipeline_sl

* Add model 2025-06-22-camembert_base_pipeline_fr

* Add model 2025-06-22-sloberta_sl

* Add model 2025-06-22-wangchanberta_finetuned_sentiment_th

* Add model 2025-06-22-wangchanberta_finetuned_sentiment_pipeline_th

* Add model 2025-06-22-feel_italian_italian_emotion_it

* Add model 2025-06-22-feel_italian_italian_sentiment_it

* Add model 2025-06-22-finance_sentiment_french_base_fr

* Add model 2025-06-22-feel_italian_italian_emotion_pipeline_it

* Add model 2025-06-22-finance_sentiment_french_base_pipeline_fr

* Add model 2025-06-22-ag_nli_dets_sentence_similarity_v4_pipeline_xx

* Add model 2025-06-22-ag_nli_dets_sentence_similarity_v4_xx

* Add model 2025-06-22-feel_italian_italian_sentiment_pipeline_it

* Add model 2025-06-24-efficient_splade_vietnamese_bt_large_doc_en

* Add model 2025-06-24-distilbert_base_cased_en

* Add model 2025-06-24-distilbert_base_multilingual_cased_pipeline_xx

* Add model 2025-06-24-distilbert_base_german_cased_de

* Add model 2025-06-24-distilbert_base_cased_pipeline_en

* Add model 2025-06-24-distilbert_base_multilingual_cased_xx

* Add model 2025-06-24-distilbert_base_uncased_en

* Add model 2025-06-24-opensearch_neural_sparse_encoding_v2_distill_en

* Add model 2025-06-24-opensearch_neural_sparse_encoding_v2_distill_pipeline_en

* Add model 2025-06-24-distilbert_base_uncased_pipeline_en

* Add model 2025-06-24-efficient_splade_vietnamese_bt_large_doc_pipeline_en

* Add model 2025-06-24-clinicalbert_pipeline_en

* Add model 2025-06-24-opensearch_neural_sparse_encoding_doc_v2_distill_pipeline_en

* Add model 2025-06-24-opensearch_neural_sparse_encoding_doc_v2_distill_en

* Add model 2025-06-24-clinicalbert_en

* Add model 2025-06-24-distilbert_base_german_cased_pipeline_de

* Add model 2025-06-24-tiny_distilbert_base_cased_distilled_squad_en

* Add model 2025-06-24-tiny_distilbert_base_cased_distilled_squad_pipeline_en

* Add model 2025-06-24-distilbert_base_uncased_distilled_squad_distilbert_en

* Add model 2025-06-24-distilbert_base_uncased_distilled_squad_distilbert_pipeline_en

* Add model 2025-06-24-question_answering_v2_pipeline_en

* Add model 2025-06-24-distilbert_base_cased_distilled_squad_distilbert_en

* Add model 2025-06-24-distilbert_base_uncased_finetuned_squad_full_pipeline_en

* Add model 2025-06-24-distilbert_base_cased_distilled_squad_distilbert_pipeline_en

* Add model 2025-06-24-question_answering_v2_en

* Add model 2025-06-24-distilbert_base_uncased_finetuned_squad_full_en

* Add model 2025-06-24-hubert_large_japanese_asr_ja

* Add model 2025-06-24-hubert_large_arabic_egyptian_ar

* Add model 2025-06-24-hubert_large_japanese_asr_pipeline_ja

* Add model 2025-06-24-hubert_large_arabic_egyptian_pipeline_ar

* Add model 2025-06-24-distilbart_mnli_12_6_en

* Add model 2025-06-24-distilbart_mnli_12_3_en

* Add model 2025-06-24-distilbart_mnli_12_6_pipeline_en

* Add model 2025-06-24-distilbart_mnli_12_1_en

* Add model 2025-06-24-awesome_fb_model_en

* Add model 2025-06-24-distilbart_mnli_12_3_pipeline_en

* Add model 2025-06-24-distilbart_mnli_12_1_pipeline_en

* Add model 2025-06-24-distilbart_mnli_12_9_pipeline_en

* Add model 2025-06-24-bart_mnli_cnn_256_pipeline_en

* Add model 2025-06-24-distilbart_mnli_12_9_en

* Add model 2025-06-24-bart_mnli_cnn_256_en

* Add model 2025-06-24-awesome_fb_model_pipeline_en

* Add model 2025-06-24-bart_large_mnli_yahoo_answers_joeddav_pipeline_en

* Add model 2025-06-24-bart_large_mnli_yahoo_answers_joeddav_en

* Add model 2025-07-03-phi_3.5_mini_instruct_int4_en

* 2025-07-15-bge_medembed_base_v0_1_openvino_en (#14629)

* Add model 2025-07-15-bge_medembed_base_v0_1_openvino_en

* Add model 2025-07-15-bge_medembed_large_v0_1_openvino_en

* Add model 2025-07-15-all_mpnet_base_v2_openvino_en

* Update 2025-07-15-bge_medembed_base_v0_1_openvino_en.md

* Update 2025-07-15-bge_medembed_large_v0_1_openvino_en.md

* Add model 2025-07-18-nuextract_2.0_2B_en

---------

Co-authored-by: AbdullahMubeenAnwar <[email protected]>
Co-authored-by: Abdullah mubeen <[email protected]>

* 2025-07-25-phi_4_mini_instruct_q4_k_m_gguf_en (#14638)

* Add model 2025-07-25-phi_4_mini_instruct_q4_k_m_gguf_en

* Add model 2025-07-25-phi_4_mini_instruct_q8_0_gguf_en

* Add model 2025-07-25-phi_4_mini_instruct_bf16_gguf_en

* Add model 2025-07-25-Phi_4_mini_instruct_int4_openvino_en

* Update 2025-07-25-Phi_4_mini_instruct_int4_openvino_en.md

* Update 2025-07-25-Phi_4_mini_instruct_int4_openvino_en.md

* Add model 2025-07-25-phi_4_mini_instruct_int8_openvino_en

---------

Co-authored-by: AbdullahMubeenAnwar <[email protected]>
Co-authored-by: Abdullah mubeen <[email protected]>

* 2025-07-31-qwen3_4b_q4_k_m_gguf_en (#14639)

* Add model 2025-07-31-qwen3_4b_q4_k_m_gguf_en

* Add model 2025-07-31-qwen3_4b_q8_0_gguf_en

* Add model 2025-07-31-qwen3_4b_bf16_gguf_en

---------

Co-authored-by: AbdullahMubeenAnwar <[email protected]>

* 2025-08-04-Qwen3_Embedding_0.6B_Q8_0_gguf_en (#14642)

* Add model 2025-08-04-Qwen3_Embedding_0.6B_Q8_0_gguf_en

* Update 2025-08-04-Qwen3_Embedding_0.6B_Q8_0_gguf_en.md

* Add model 2025-08-04-Phi_4_mini_instruct_Q4_K_M_gguf_en

* Add model 2025-08-04-Qwen2.5_VL_3B_Instruct_Q4_K_M_gguf_en

---------

Co-authored-by: DevinTDHa <[email protected]>
Co-authored-by: Devin Ha <[email protected]>

* Add model 2025-08-11-qwen2_vl_2b_instruct_q4_gguf_en (#14648)

Co-authored-by: AbdullahMubeenAnwar <[email protected]>

* Add model 2025-09-01-bge_reranker_v2_m3_Q4_K_M_en

* Add model 2025-09-15-qwen2.5_vl_7b_instruct_q16_gguf_en (#14661)

Co-authored-by: AbdullahMubeenAnwar <[email protected]>

* 2025-11-03-umlsbert_eng_onnx_en (#14683)

* Add model 2025-11-03-umlsbert_eng_onnx_en

* Update 2025-11-03-umlsbert_eng_onnx_en.md

---------

Co-authored-by: AbdullahMubeenAnwar <[email protected]>
Co-authored-by: Abdullah mubeen <[email protected]>

* Add model 2025-11-04-bert_base_uncased_multiple_choice_en

* Add model 2025-11-07-distilbert_base_cased_en (#14688)

Co-authored-by: AbdullahMubeenAnwar <[email protected]>

* Add model 2025-11-07-bge_base_en_v1_5_onnx_en (#14690)

Co-authored-by: AbdullahMubeenAnwar <[email protected]>

* Add model 2025-11-09-glove2024_wikigiga_200d_en

---------

Co-authored-by: ahmedlone127 <[email protected]>
Co-authored-by: jsl-models <[email protected]>
Co-authored-by: prabod <[email protected]>
Co-authored-by: AbdullahMubeenAnwar <[email protected]>
Co-authored-by: Abdullah mubeen <[email protected]>
…ic… (#14701)

* SPARKNLP-1315 changing input data type for CamemBertForTokenClassification from int64 to int32

* SPARKNLP-1315 adding test for tensorflow models
* NerDLGraphChecker: add missing setter on Scala side

* Introduce NerDLDataLoader for NerDLApproach

Threaded NerDLDataLoader fetches batches in the background while
training is happening in NerDLApproach, reducing idle time in the driver
thread.
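
The background-fetching idea can be sketched with a bounded queue and a producer thread. This is a generic prefetch pattern written in Python for illustration, not the Scala NerDLDataLoader itself; `prefetch_batches` and `buffer_size` are hypothetical names.

```python
import queue
import threading
from typing import Iterable, Iterator, TypeVar

T = TypeVar("T")
_DONE = object()  # sentinel marking the end of the batch stream


def prefetch_batches(batches: Iterable[T], buffer_size: int = 2) -> Iterator[T]:
    """Yield batches while a background thread prepares the next ones."""
    buffer: queue.Queue = queue.Queue(maxsize=buffer_size)

    def producer() -> None:
        try:
            for batch in batches:
                buffer.put(batch)  # blocks while the buffer is full
        finally:
            # Always signal completion so the consumer does not wait forever.
            buffer.put(_DONE)

    # Daemon thread: it will not keep the process alive if the consumer stops early.
    threading.Thread(target=producer, daemon=True).start()
    while True:
        item = buffer.get()
        if item is _DONE:
            return
        yield item
```

A training loop would iterate over `prefetch_batches(load_batches())`, where `load_batches` stands for whatever produces the batches, so that loading the next batch overlaps with training on the current one. The bounded buffer limits how far ahead the producer runs, which caps memory use. Error propagation from the producer is left out of this sketch.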

* NerDLApproach: Optimize partitioning flag

Allow NerDLApproach to repartition the input dataset, so the driver does
not run out of memory when training on large partitions.

* NerDL Optimizations python side