general i18n internationalization programming unicode

More Sources to Learn Internationalization

Here is a collection of more links for learning about internationalization, beyond the really good videos from the recent Unicode event and the other links that I mentioned along with them. I may keep updating this space so that it gathers all of the links for learning about internationalization, ordered by what might be most useful to total beginners.

  • Getting Started with Internationalization – A quick, well-written intro to basic terms and concepts with illustrative pictures
  • Unicode Demystified by Rich Gillam – A book written in simple language that explains the technical aspects of Unicode, surveys writing systems / languages, and gives an overview of the higher-level topics. You should probably read this before reading the Unicode core specification, if you ever need to do that.
C++ i18n internationalization java Rust unicode

Introduction to Internationalization and Unicode

The Unicode Consortium held an online event last week entitled “Overview of Internationalization and Unicode Projects”. The 6 videos have an average runtime of less than 15 minutes and build on one another to give a gradual explanation of the internationalization / Unicode ecosystem. The videos are also embedded below.

A colleague asked me for a good way to get some background in internationalization. Here is what I told that person:

general programming tamil thamil

Redesigning an Input Method for an Abugida Script

After I previously talked about problems of input methods for abugida scripts, and added more supporting details to the point, I finally started prototyping possible implementations of the idea (try it out!).

But there are quite a few constraints and tradeoffs that come up once you start thinking about the details. I think these issues apply generally to most abugida scripts, so I am documenting all of the details below. Also, getting a new input method adopted requires more than perfecting the technical details and user experience: it also requires overcoming user inertia (or creating awareness) and educating industry experts and those implementing changes. If you have feedback, please send it my way so that I can continue to update this post with the latest information.

clojure java programming Rust web

Choose the exemplar programming language for the use case

I think my understanding of programming languages (for example, what role they serve in engineering work, or what my favorite language is) continues to evolve, and it has done so once more recently. Here is where I started from and where I stand:

clojure programming

Random Bits of Lisp/Clojure Hype

A friend sent me a link to Uncle Bob’s blog post on Clojure (2019), where he explains his road from hating Lisp to appreciating it to the point where its simplicity and power make it his favorite language. He declared it a language for the ages, just as I did. Given that Uncle Bob is one of the original authors of the influential Agile Manifesto and has used several different types of languages, his declaration has more reach, and he explained it more concisely than I could anyway. And what about Paul Graham, whose ideas have a large audience and who hasn’t stopped writing about Lisp’s secret superpowers? He has apparently been creating a new language, Bel, a version of Lisp that is defined in itself; the challenge of that premise is like a math/logic puzzle, which resonates with the history of Lisp as an unintended result of implementing math theorems on a computer. But he seems to have liked Uncle Bob’s post, and in recent years, when asked, he has also recommended Clojure as the flavor of Lisp to use for modern times (1, 2, 3, 4, 5).

clojure general programming Rust

Learning Rust for Beginners

Rust is a new-ish language that is very compelling in certain contexts, but it has a really deceptive learning curve, so I wanted to provide the links that I have found most effective for slow-learning beginners like myself, especially because the “official” Rust book(s) are, to me, paradoxically hard to learn from despite being thorough.

clojure general programming tamil thamil

Deriving Lexical Data for Tamil from Scratch Using Morphology

I presented at the Unicode Conference 2 weeks ago, on Oct. 16, on important yet overlooked issues that concern languages that use abugida scripts and have agglutinative morphology, using the Thamil language as a case study. Although the talk was mainly about the issues around dictionary data sets, other issues included input methods and the need for phoneme-level segmentation for these use cases. See below for more details:


Pre-recorded talk:

The talk covered the following topics:

clojure programming

Why Clojure (Lisp) is good for writing transpilers

OmnICU is a new project to create Internationalization (i18n) functionality in multiple target languages and multiple resource-constrained runtimes. Two different approaches to solving that problem are to wrap a single common binary in multiple target-language wrappers, and to write a source-to-source transpiler in a one-to-many fashion. Here are reasons why Clojure (Lisp) would be a good choice for writing a transpiler.
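
To make the one-to-many idea a bit more concrete, here is a minimal sketch in Python. It is only an illustration and not OmnICU's actual design: the tuple-based mini-AST, the `OPERATORS` table, and the `emit_rust`, `emit_java`, and `transpile` functions are all hypothetical names that I made up for this example. The point is just that one source representation, written as plain nested data, gets emitted into several target languages.

```python
# Hypothetical mini-AST in prefix form, e.g. ("add", 1, ("mul", 2, "x")).
# Leaves are integer literals or parameter names; nothing here is a real OmnICU API.

OPERATORS = {"add": "+", "mul": "*"}


def emit_infix(node):
    """Emit a fully parenthesized infix expression from the prefix-form AST."""
    if isinstance(node, tuple):
        op, left, right = node
        return f"({emit_infix(left)} {OPERATORS[op]} {emit_infix(right)})"
    return str(node)


def emit_rust(name, params, body):
    """Emit a Rust function definition (assumes every value is an i64)."""
    args = ", ".join(f"{p}: i64" for p in params)
    return f"fn {name}({args}) -> i64 {{ {emit_infix(body)} }}"


def emit_java(name, params, body):
    """Emit a Java static method (assumes every value is a long)."""
    args = ", ".join(f"long {p}" for p in params)
    return f"static long {name}({args}) {{ return {emit_infix(body)}; }}"


# One source definition, many targets: adding a target means adding one emitter.
TARGETS = {"rust": emit_rust, "java": emit_java}


def transpile(name, params, body):
    """Return the emitted source text for every registered target language."""
    return {target: emit(name, params, body) for target, emit in TARGETS.items()}


if __name__ == "__main__":
    for target, source in transpile("scale", ["x"], ("add", 1, ("mul", 2, "x"))).items():
        print(f"{target}: {source}")
```

The source program here is ordinary nested data, which is roughly the shape that Lisp code already has.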

programming tamil thamil

VS Code supports abugida scripts

Finding text editors that properly support the input and navigation of various scripts’ Unicode-encoded text is no longer as rare as it used to be. Unicode has been well-established for a long time as the standard for encoding all of the world’s languages. However, when it comes to text editors specifically for programmers (IDEs), ironically, the situation is pretty bad. It looks like Visual Studio Code’s most recent update finally adds proper support for input and navigation of abugida scripts, or as they’re alternatively called, alphasyllabaries. The animated picture in the VS Code update page shows someone typing and navigating Tamil text, but the change should actually apply to several languages across East Africa, South Asia, and Southeast Asia.
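
As a rough illustration of why navigation is tricky for these scripts, here is a small Python sketch using only the standard `unicodedata` module. The helper names `describe_code_points` and `naive_clusters` are my own, and the grouping rule (attach every combining mark to the preceding base character) is a deliberate simplification, not the full Unicode text segmentation algorithm and not necessarily what VS Code does internally.

```python
import unicodedata


def describe_code_points(text):
    """Return (code point, general category, character name) for each code point."""
    return [
        (f"U+{ord(ch):04X}", unicodedata.category(ch), unicodedata.name(ch))
        for ch in text
    ]


def naive_clusters(text):
    """Group each base character with the combining marks that follow it.

    Simplifying assumption: any code point whose general category starts
    with "M" (Mn, Mc, Me) belongs to the preceding character. Real editors
    should use a proper grapheme cluster implementation instead.
    """
    clusters = []
    for ch in text:
        if clusters and unicodedata.category(ch).startswith("M"):
            clusters[-1] += ch
        else:
            clusters.append(ch)
    return clusters


# The word தமிழ் ("Tamil"), spelled out as escapes so the code points are visible.
word = "\u0ba4\u0bae\u0bbf\u0bb4\u0bcd"

if __name__ == "__main__":
    for row in describe_code_points(word):
        print(row)
    print(len(word), "code points,", len(naive_clusters(word)), "letters as a reader sees them")
```

The word is stored as 5 code points, but a reader sees 3 letters, so an editor that moves the cursor one code point at a time can land in the middle of a letter.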

clojure java programming

Helper code to mimic Clojure fns in Scala

I’ve finished my 3.5-year stint writing Scala, and I haven’t stopped missing writing Clojure. The knowledge of Clojure continues to heighten and inform my programmer sensibilities. One thing that I appreciated about Scala is that it was as good a medium as you might practically find for writing Clojure without writing Clojure. I liked to think of Scala as the canvas on which I painted my Clojure ideas. Because Scala makes itself amenable to many styles of programming at once (at least, FP and OOP), it was possible to write code by imagining what the Clojure code would look like, and then writing that in Scala syntax. Interestingly, the more I did this, and the more faithfully I did so, the more people implicitly (no pun intended!) acknowledged the code as “good Scala code”. Because, you know, most Scala programmers agree that good Scala code puts “val”s at the top of a function body, uses immutable collections exclusively, prefers functions over (object) methods, and makes functions small, stateless, and composable. More on that later. Here, I simply want to release some of the code that I wrote in Scala to fill in a few perceived gaps in Scala’s Seq abstraction, where the perception is based on what I was accustomed to using in Clojure.