GitHub removes HTML-like tags from Jupyter notebooks #1

@mobassir94

While working on the code-mixed book readers, we had to write code that tags text by language. For example, look at the tag_text() function in https://github.com/mobassir94/comprehensive-bangla-tts/blob/main/TTS_text_preprocessing/in_depth_mlt_text_processing_for_bn_TTS.ipynb. On GitHub it shows:

# tag the text
for m in parts:
    if len(m.strip())>1:text=text.replace(m,f"{m}")
# clean-tags
text=text.replace("start",'')
text=text.replace("end",'')

instead of the correct code, which should be:

# tag the text
for m in parts:
    if len(m.strip())>1:text=text.replace(m,f"</ar><SPLIT><bn>{m}</bn><SPLIT><ar>")
# clean-tags
text=text.replace("</ar><SPLIT><bn>start",'<bn>')
text=text.replace("end</bn><SPLIT><ar>",'</bn>')

which can be found in the corresponding .py file (converted from .ipynb to .py before uploading to GitHub): https://github.com/mobassir94/comprehensive-bangla-tts/blob/main/TTS_text_preprocessing/in_depth_mlt_text_processing_for_bn_tts.py

As the example above shows, if a notebook contains HTML-like tags inside Python code, then after the notebook is uploaded to GitHub, GitHub automatically replaces all of those tags with '' (the empty string), which turns the originally correct snippet into broken code. To avoid such errors, we have uploaded .py conversions of all the notebooks that were broken after uploading to GitHub. Below is the list of the problematic notebooks and their corresponding correct .py scripts:
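To make the effect concrete, here is a minimal sketch that puts the two versions of the loop side by side. The function names, the sample string, and the contents of `parts` are hypothetical placeholders; how the real tag_text() builds `parts` and where the "start"/"end" markers come from is not shown in this issue.

```python
def tag_stripped(text, parts):
    # Version as displayed in the broken notebook: the tags are gone,
    # so each replace() swaps m for itself and changes nothing.
    for m in parts:
        if len(m.strip()) > 1:
            text = text.replace(m, f"{m}")
    # The clean-up now deletes any literal "start"/"end" instead of fixing tags.
    text = text.replace("start", "")
    text = text.replace("end", "")
    return text


def tag_original(text, parts):
    # Version from the .py file: each part is wrapped in <bn> tags and
    # separated from the surrounding <ar> span by <SPLIT> markers.
    for m in parts:
        if len(m.strip()) > 1:
            text = text.replace(m, f"</ar><SPLIT><bn>{m}</bn><SPLIT><ar>")
    text = text.replace("</ar><SPLIT><bn>start", "<bn>")
    text = text.replace("end</bn><SPLIT><ar>", "</bn>")
    return text


# Hypothetical input: ASCII placeholders stand in for Arabic and Bangla text.
sample = "<ar>ARABIC1 bangla words ARABIC2</ar>"
parts = ["bangla words"]

print(tag_stripped(sample, parts))
print(tag_original(sample, parts))
```

With the stripped version the text comes back untagged, so any later step that splits on <SPLIT> or looks for <bn> spans has nothing to work with.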

  1. broken notebook : https://github.com/mobassir94/comprehensive-bangla-tts/blob/main/TTS_text_preprocessing/in_depth_mlt_text_processing_for_bn_TTS.ipynb

    original code : https://github.com/mobassir94/comprehensive-bangla-tts/blob/main/TTS_text_preprocessing/in_depth_mlt_text_processing_for_bn_tts.py

  2. broken notebook : https://github.com/mobassir94/comprehensive-bangla-tts/blob/main/code-mixed%20book%20readers/tafsir-jalalayn-book-reader-tts.ipynb

    original code : https://github.com/mobassir94/comprehensive-bangla-tts/blob/main/code-mixed%20book%20readers/tafsir-jalalayn-book-reader-tts.py

  3. broken notebook : https://github.com/mobassir94/comprehensive-bangla-tts/blob/main/code-mixed%20book%20readers/tafsir_bayan_reader.ipynb

    original code : https://github.com/mobassir94/comprehensive-bangla-tts/blob/main/code-mixed%20book%20readers/tafsir_bayan_multilingual_(bn%2Bar)_tts_based_qtafsir_reader.py

  4. broken notebook : https://github.com/mobassir94/comprehensive-bangla-tts/blob/main/mlt_TTS_inference_demo/Multilingual_(ben%2Bara)_tts_inference_colab_demo.ipynb

    original code : https://github.com/mobassir94/comprehensive-bangla-tts/blob/main/mlt_TTS_inference_demo/multilingual_(ben%2Bara)_tts_inference_colab_demo.py

  5. broken notebook : https://github.com/mobassir94/comprehensive-bangla-tts/blob/main/mlt_TTS_inference_demo/v1_Multilingual_(ben%2Bara)_tts_based_quranic_tafsir_reader.ipynb

    original code : https://github.com/mobassir94/comprehensive-bangla-tts/blob/main/mlt_TTS_inference_demo/v1_multilingual_(ben%2Bara)_tts_based_quranic_tafsir_reader.py

  6. broken notebook : https://github.com/mobassir94/comprehensive-bangla-tts/blob/main/mlt_TTS_inference_demo/v2_Multilingual_(bn%2Bar)_tts_based_Qtafsir_reader.ipynb

    original code : https://github.com/mobassir94/comprehensive-bangla-tts/blob/main/mlt_TTS_inference_demo/v2_multilingual_(bn%2Bar)_tts_based_qtafsir_reader.py

  7. broken notebook : https://github.com/mobassir94/comprehensive-bangla-tts/blob/main/prepare_dataset/banglanmt_tafsir-ibn-kathir-en-to-bn-translator.ipynb

    original code : https://github.com/mobassir94/comprehensive-bangla-tts/blob/main/prepare_dataset/banglanmt_tafsir-ibn-kathir-en-to-bn-translator.py

  8. broken notebook : https://github.com/mobassir94/comprehensive-bangla-tts/blob/main/prepare_dataset/tafsir-al-jalalayn-en-to-bn-translator.ipynb

    original code : https://github.com/mobassir94/comprehensive-bangla-tts/blob/main/prepare_dataset/tafsir-al-jalalayn-en-to-bn-translator.py

If you try to run the broken notebooks above locally, you should expect errors. To fix them, check the corresponding original code (.py files) and find and replace the lines where GitHub automatically removed the HTML-like tags.
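Finding the affected lines by eye is tedious, so here is a rough sketch of one way to locate them with only the standard library. It assumes the notebook is a standard nbformat 4 JSON file (a top-level "cells" list whose code cells carry a "source" string or list of strings) and that, as reported above, its code cells lost their tags. The helper names notebook_code_lines() and mismatched_blocks() are made up for this example.

```python
import difflib


def notebook_code_lines(nb):
    """Collect the source lines of every code cell in a parsed .ipynb dict."""
    lines = []
    for cell in nb.get("cells", []):
        if cell.get("cell_type") != "code":
            continue
        src = cell.get("source", "")
        if isinstance(src, list):  # nbformat stores source as a list of lines
            src = "".join(src)
        lines.extend(src.splitlines())
    return lines


def mismatched_blocks(nb, py_text):
    """Return (notebook_lines, py_lines) pairs for every region that differs."""
    # Ignore indentation and blank lines so only real content changes remain.
    nb_lines = [ln.strip() for ln in notebook_code_lines(nb) if ln.strip()]
    py_lines = [ln.strip() for ln in py_text.splitlines() if ln.strip()]
    matcher = difflib.SequenceMatcher(a=nb_lines, b=py_lines, autojunk=False)
    return [
        (nb_lines[i1:i2], py_lines[j1:j2])
        for tag, i1, i2, j1, j2 in matcher.get_opcodes()
        if tag != "equal"
    ]
```

To use it, load the notebook with json.load() and read the .py file as text, then pass both to mismatched_blocks(). The converted .py files may also contain header comments or converted markdown cells that the notebook's code cells lack, so some reported pairs will be noise; the pairs whose .py side contains tags such as <bn> or <SPLIT> are the ones to copy back into the notebook.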
