ISCII - The Desi SCII

Thanks to Nareshov, I worked quite a bit on ISCII and managed to churn out some code that converts it to UTF-8. The script is quite general and can be quickly modded for other SCII to UTF-8 conversions.

The script requires a CSV file containing the mappings from the SCII to Unicode. You can write the CSV by comparing your SCII table with the corresponding Unicode chart, available at http://www.unicode.org/charts/
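To make the expected layout concrete, here is a minimal sketch in Python 3 of a mapping file and a loader for it. Judging from how the script reads its input, the first row names the two encodings and every later row holds an ISCII byte and a Unicode code point, both in hexadecimal. The three sample rows and the helper name `load_table` are assumptions for illustration, not the contents of the bundled CSV, so check any real table against the charts.

```python
import csv
import io

# Hypothetical sample in the layout the script reads: a header row naming
# the two encodings, then "ISCII byte, Unicode code point" pairs in hex.
SAMPLE_CSV = """ISCII,Telugu
A2,0C02
B5,0C17
C2,0C24
"""

def load_table(handle):
    """Return (header, mapping) read from an open CSV file object."""
    reader = csv.reader(handle)
    header = next(reader)          # e.g. ['ISCII', 'Telugu']
    mapping = {}
    for row in reader:
        if len(row) < 2:           # skip blank or malformed rows
            continue
        mapping[int(row[0], 16)] = int(row[1], 16)
    return header, mapping

header, table = load_table(io.StringIO(SAMPLE_CSV))
print(header, {hex(k): hex(v) for k, v in table.items()})
```

Passing a file object rather than a file name keeps the loader easy to try out with an in-memory sample like the one above.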

Example ISCII-Telugu CSV and the Python code: scii2utf8.zip

Be sure to view the output in a terminal that understands UTF-8.

# -*- coding: utf-8 -*-
import csv
 
filetable = open("iscii-telugu.csv","rb")
convtable_length=95 #length of the csv above.
 
reader=csv.reader(filetable)
convtable={}
lang={}
icing=0
 
for row in reader:
	if icing == 0:
		# The first row is a header naming the two encodings.
		lang[row[0]] = row[1]
		icing = 1
		print "Language conversion between"
		#Just a bit of icing
		print lang
	elif icing <= convtable_length:
		# Remaining rows: ISCII byte, Unicode code point (both in hex).
		convtable[int(row[0], 16)] = int(row[1], 16)
		icing += 1
 
#print convtable
 
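def code2utf(a):
	# Build the UTF-8 bytes for code point a (below 0x10000) by hand, then decode them.
	if(a<128):
		return unicode(chr(a),"utf-8")
	if(a<2048):
		return unicode(chr((a>>6)+192)+chr((a&63)+128),"utf-8")
	return unicode(chr((a>>12)+224)+chr(((a>>6)&63)+128)+chr((a&63)+128),"utf-8")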
 
def conv2utf(s):
	stri=u""
	for let in s:
		if ord(let) in convtable:
			stri=stri+code2utf(convtable[ord(let)])
		else:
			stri=stri+let
	return stri.encode('utf-8')
 
print conv2utf(unicode("ÍÚÂÍÚÌ¢ µÂÏ×¢ ÈÞÂÛ ÈÏèÍÝÖÛ¢ ¸ ÍÂèê ",'utf-8')) 
# Some test lines from the gita.
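
The listing above is Python 2. On Python 3 the same idea needs less machinery, because `chr` turns a code point into a character and `str.encode` produces the UTF-8 bytes. The sketch below is a hedged port, not the code shipped in scii2utf8.zip: `iscii_to_text`, `encode_utf8_manual` and the five-entry `SAMPLE_TABLE` are hypothetical names and data, and like the script above it does a plain one-to-one substitution per byte. The hand-written encoder is there only to show the bit arithmetic that `code2utf` relies on.

```python
# Hypothetical excerpt of an ISCII -> Telugu table (ISCII byte -> code point).
SAMPLE_TABLE = {
    0xA2: 0x0C02,  # anusvara
    0xB5: 0x0C17,  # GA
    0xC2: 0x0C24,  # TA
    0xCF: 0x0C30,  # RA
    0xD7: 0x0C38,  # SA
}

def iscii_to_text(data, table):
    """Map each ISCII byte through `table`; unmapped bytes pass through as-is."""
    return "".join(chr(table.get(byte, byte)) for byte in data)

def encode_utf8_manual(code_point):
    """Encode one code point below 0x10000 as UTF-8 bytes by hand."""
    if code_point < 0x80:
        return bytes([code_point])
    if code_point < 0x800:
        return bytes([0xC0 | (code_point >> 6), 0x80 | (code_point & 0x3F)])
    return bytes([0xE0 | (code_point >> 12),
                  0x80 | ((code_point >> 6) & 0x3F),
                  0x80 | (code_point & 0x3F)])

# The bytes of the second word in the test line above.
raw = bytes([0xB5, 0xC2, 0xCF, 0xD7, 0xA2])
text = iscii_to_text(raw, SAMPLE_TABLE)
print(text)
```

Working on `bytes` avoids the detour in the original test line, where the ISCII bytes are written as Latin-1 characters and then turned back into numbers with `ord`.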