Categories
general

ICU4X Mailing List and 0.1 Release

People who are interested in internationalization (i18n) are likely writing software using ICU, the gold standard library for internationalization functionality and performance. However, ICU is available only in C/C++ (“ICU4C”) and Java (“ICU4J”), and is quite the behemoth. In order to support other programming languages directly and to support more resource-constrained computing environments (e.g., mobile), we have the ICU4X project.

The first preliminary release, v0.1, is now official, and the current code has been published on Rust’s crates.io.

To receive future project announcements and to stay connected, sign up for the icu4x-announce@unicode.org mailing list.

Categories
clojure general programming tamil thamil

Deriving Lexical Data for Tamil from Scratch Using Morphology

I presented at the Unicode Conference 2 weeks ago, on Oct. 16, on important yet overlooked issues that concern languages that use abugida scripts and have agglutinative morphology, using the Thamil language as a case study. Although the talk was mainly about the issues around dictionary data sets, it also covered input methods and the need for phoneme-level segmentation for these use cases. See below for more details:

Slides:

https://docs.google.com/presentation/d/1EdNLgh8MyvSqDlm2I2_aXM-WgTINaZekXZWq0629ZLQ/edit

Pre-recorded talk:

The talk covered the following topics:

Categories
general

Tamil Names

I was talking with my friend about how Tamil names differ from Western names. During the conversation, we reminisced about how he was interviewed by a local radio show on how his name is “long”. I remembered feeling unimpressed by the radio segment with my friend, and it helped explain more about Tamil names.

Categories
general

Ode to a Flame Lily

I just woke up from a dream where people were looking at a newly published book in English, and on one of the introductory dedication pages of the book was the translation of a Tamil poem. Both the book and the poem were as imaginary as the dream itself, but the first verse caught my attention and filled my senses:

Categories
general

Distractions during shelter-in-place

Most people have daily stresses during this time of Coronavirus. Working from home is lucky compared to the impact on livelihoods and health. Social distancing is necessary but does have its own little impact. For the moments where you have time to connect with friends online and recharge by taking your mind off the state of the world, here are some options:

Categories
general

More Instant Pot & other recipes

Now that people are sheltering in place and cooking, it’s good to record some more recipes. It’s even better when they are the kind that can easily scale up (can be made in pots of any sort).

Categories
general

Instant Pot recipes for Tamil food

Here are some recipes for Tamil food using the Instant Pot. They represent the way I’ve been making these dishes recently for myself.

Categories
clojure programming

Why Clojure (Lisp) is good for writing transpilers

OmnICU is a new project to create Internationalization (i18n) functionality in multiple target languages and multiple resource-constrained runtimes. Two different approaches to solving that problem are wrapping a single common binary in multiple target-language wrappers and writing a source-to-source transpiler in a one-to-many fashion. Here are reasons why choosing Clojure (Lisp) would be a good decision for writing a transpiler.

Categories
general

Learning about staying safe from COVID-19

Daily life in the West has finally been undeniably, indefinitely disrupted by the current COVID-19 strain of Coronavirus. It’s important to take precautions, but the situation isn’t so dire yet that we can’t learn a little more about what & why. Here is the more useful information that I’ve come across so far.

Categories
programming tamil thamil

VS Code supports abugida scripts

Text editors that properly support the input and navigation of various scripts’ Unicode-encoded text are no longer as rare as they used to be. Unicode has been well-established for a long time as the standard for encoding all of the world’s languages. However, when it comes to text editors specifically for programmers (IDEs), ironically, the situation is pretty bad. It looks like Visual Studio Code’s most recent update finally has proper support for input and navigation of abugida scripts, or as they’re alternatively called, alphasyllabaries. The animated picture in the VS Code update page shows someone typing and navigating Tamil text, but the change should actually apply to several languages across East Africa, South Asia, and Southeast Asia.
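To illustrate why navigation in abugida scripts is harder than it looks, here is a minimal Python sketch using only the standard library. It takes one Tamil syllable, "கி" (ki), and lists the code points it is stored as. The `describe` helper is a hypothetical name for illustration, and this is not a description of how VS Code handles the text internally.

```python
import unicodedata


def describe(text):
    """Return (code point, Unicode name, general category) for each character."""
    return [
        (f"U+{ord(ch):04X}", unicodedata.name(ch), unicodedata.category(ch))
        for ch in text
    ]


# One Tamil syllable as a reader sees it: the consonant KA
# followed by the dependent vowel sign I.
syllable = "\u0b95\u0bbf"

# The string is stored as two code points even though it is read as one unit.
print(len(syllable))
for code_point, name, category in describe(syllable):
    print(code_point, name, category)
```

The second code point is a combining mark (its general category starts with "M"), so an arrow-key press or a backspace that treats each code point as its own character lands in the middle of what a reader perceives as one letter.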