Bijankhan is a large tagged corpus of the Persian language. Unfortunately, the corpus is not encoded in UTF-8 and contains many misspelled words. I tried to normalize the Bijankhan corpus with some simple replacements.
Corrected items:
Simply run bijankhan.py to normalize the corpus. For help, type 'python bijankhan.py -h'.
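To illustrate what "normalizing by simple replacements" can mean, here is a minimal sketch. It is not taken from bijankhan.py: the `REPLACEMENTS` table and the `normalize` function are hypothetical names, and the two mappings (Arabic yeh and kaf to their Persian forms) are only a common example of this kind of fix, assumed here for illustration.

```python
# Hypothetical replacement table for illustration only;
# this is NOT the table used by bijankhan.py.
REPLACEMENTS = {
    "\u064a": "\u06cc",  # ARABIC LETTER YEH -> ARABIC LETTER FARSI YEH
    "\u0643": "\u06a9",  # ARABIC LETTER KAF -> ARABIC LETTER KEHEH
}

# Build the translation table once so it can be reused for every line.
_TABLE = str.maketrans(REPLACEMENTS)


def normalize(text: str) -> str:
    """Apply the character-level replacements to a string."""
    return text.translate(_TABLE)


if __name__ == "__main__":
    sample = "\u0643\u062a\u0627\u0628\u064a"  # written with Arabic kaf and yeh
    print(normalize(sample))
```

A translation table keeps each replacement a single character-to-character mapping, so the whole text is processed in one pass. Multi-character corrections such as misspelled words would need a different approach, for example repeated `str.replace` calls.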
If you like this project, please donate or consider becoming a patron:
You are free to use or modify this project under its license (GNU v3.0). However, if you would like to use the Bijankhan corpus itself, you need to contact its owner.