A tool for working with glossaries (dictionary databases) in Python, including editing glossaries and converting them between many formats. The support matrix is:

| Format | Extension | Read | Write |
|---|---|---|---|
| ABBYY Lingvo DSL | .dsl | X | |
| AppleDict Source | .xml | | X |
| Babylon | .bgl | X | |
| Babylon Source | .gls | X | |
| DictionaryForMIDs | | | X |
| DICTD dictionary server | .index | X | X |
| FreeDict | .tei | X | |
| Gettext Source | .po | X | X |
| SQLite | MDic .m2, Sib .sdb | X | X |
| Octopus MDic | .mdx | X | |
| Octopus MDic Source | .txt | X | X |
| Omnidic | | X | X |
| PMD | | X | X |
| Sdictionary Binary | .dct | X | |
| Sdictionary Source | .sdct | X | |
| SQL | | | X |
| StarDict | .ifo | X | X |
| Tabfile | .txt, .dic | X | X |
| TreeDict | | | X |
| XDXF | .xdxf | X | |
| xFarDic | .xdb | X | X |

BeautifulSoup4 (with html5lib as its backend) is required to sanitize HTML contents:

    sudo easy_install beautifulsoup4 html5lib

GNU make is also required; it is part of Command Line Tools for Xcode.

The Dictionary Development Kit is also required; it is part of Auxiliary Tools for Xcode. Extract it to:

    /Developer/Extras/Dictionary Development Kit

Let's assume the Babylon dict is at ~/Documents/Duden_Synonym/Duden_Synonym.BGL:

    cd ~/Documents/Duden_Synonym/
    python ~/Software/pyglossary/pyglossary.pyw --read-options=resPath=OtherResources --write-format=AppleDict Duden_Synonym.BGL Duden_Synonym.xml
    make
    make install

Launch Dictionary.app and test.
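
The conversion steps above could also be scripted. The following is a minimal sketch in Python 3, not part of pyglossary itself: `build_convert_command` and `build_steps` are hypothetical helper names, the script path is simply the one assumed in the example commands, and only the two options shown in this README are used.

```python
import os
import shlex

# Assumed install location, copied from the example commands in this README.
PYGLOSSARY = os.path.expanduser("~/Software/pyglossary/pyglossary.pyw")


def build_convert_command(src, dst, write_format=None, read_options=None):
    """Return the argv list for one conversion (hypothetical helper)."""
    cmd = ["python", PYGLOSSARY]
    if read_options:
        cmd.append("--read-options=" + read_options)
    if write_format:
        cmd.append("--write-format=" + write_format)
    cmd += [src, dst]
    return cmd


def build_steps(src, dst, **kwargs):
    """Conversion followed by the two make invocations from the walkthrough."""
    return [build_convert_command(src, dst, **kwargs), ["make"], ["make", "install"]]


if __name__ == "__main__":
    steps = build_steps(
        "Duden_Synonym.BGL",
        "Duden_Synonym.xml",
        write_format="AppleDict",
        read_options="resPath=OtherResources",
    )
    for step in steps:
        # Only print the shell-quoted commands; nothing is executed here.
        print(shlex.join(step))
```

To actually run the steps, each list could be passed to `subprocess.run(step, check=True, cwd=...)` with `cwd` set to the dictionary's directory. Building argument lists instead of one shell string keeps file names with spaces intact.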
Let's assume the MDict dict is at ~/Documents/Duden-Oxford/Duden-Oxford DEED ver.20110408.mdx.
Use GetDict to extract the MDict dictionary (.mdx). Choose the "UTF-8 TXT" output format and use "Duden-Oxford DEED ver.20110408.mtxt" as the output file name.
Run the following commands:

    cd ~/Documents/Duden-Oxford/
    python ~/Software/pyglossary/pyglossary.pyw "Duden-Oxford DEED ver.20110408.mtxt" "Duden-Oxford DEED ver.20110408.xml"
    make
    make install

Launch Dictionary.app and test.
Let's assume the MDict dict is at ~/Downloads/oald8/oald8.mdx, along with the image/audio resources file oald8.mdd.
Run the following commands:

    cd ~/Downloads/oald8/
    python ~/Software/pyglossary/pyglossary.pyw --read-options=resPath=OtherResources --write-format=AppleDict oald8.mdx oald8.xml

This extracts the dictionary into oald8.xml and the data resources into the folder OtherResources.
Hyperlinks must use relative paths, so strip the leading slash from every src attribute:

    sed -i "" 's:src="/:src=":g' oald8.xml

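
To make the effect of this substitution concrete, here is a small Python 3 sketch of the same rewrite applied to an in-memory string. The function name `make_src_relative` and the sample markup are made up for illustration; the real step is the sed command.

```python
def make_src_relative(xml_text):
    """Turn src="/path" into src="path" everywhere in the given text."""
    # Same literal replacement as the sed expression s:src="/:src=":g
    return xml_text.replace('src="/', 'src="')


sample = '<img src="/images/cat.png"/> <img src="dog.png"/>'
print(make_src_relative(sample))
# <img src="images/cat.png"/> <img src="dog.png"/>
```

Paths that are already relative are left unchanged, so running the rewrite twice is harmless for ordinary paths.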
Convert the audio files from SPX format to WAV format, then point the sound links at the new files:

    find OtherResources -name "*.spx" -execdir sh -c 'speexdec "$1" "${1%.*}.wav"' sh {} \;
    sed -i "" 's|sound://\([/_a-zA-Z0-9]*\)\.spx|\1.wav|g' oald8.xml

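
The renaming and link rewriting in this step can be illustrated with a short Python 3 sketch. It is only an approximation of the two commands: `wav_name` and `rewrite_sound_links` are hypothetical names, the sample link is invented, and the pattern escapes the dot before `spx` so that only a literal `.spx` suffix matches.

```python
import re
from pathlib import PurePosixPath

# Mirrors the sed pattern, with the dot before "spx" escaped.
_SOUND_LINK = re.compile(r"sound://([/_a-zA-Z0-9]*)\.spx")


def wav_name(spx_path):
    """Map an .spx file name to its .wav counterpart, like ${spx%.*}.wav."""
    return str(PurePosixPath(spx_path).with_suffix(".wav"))


def rewrite_sound_links(xml_text):
    """Replace sound://NAME.spx with NAME.wav, leaving other text untouched."""
    return _SOUND_LINK.sub(r"\1.wav", xml_text)


print(wav_name("OtherResources/us/hello_1.spx"))
print(rewrite_sound_links('<a href="sound://us/hello_1.spx">play</a>'))
```

Note that the character class only covers letters, digits, underscores and slashes, so a link whose file name contains other characters (a hyphen, for example) is not rewritten by either the sed command or this sketch.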
Compile and install:

    make
    make install

Launch Dictionary.app and test.