Categories
general

Google Design blog post on Indian language fonts

The new post from the Google Design blog entitled “The New Wave of Indian Type” is very interesting. The differences among the examples of Tamil fonts in the blog post are already striking. And the blog post, of course, is showcasing fonts with open-source licenses and encouraging people to use them as starting points to create their own. I like these ideas very much.

Categories
general

Recent articles about kids, language, and technology

If you have seen the talk that Tim and I presented at Clojure/conj in October on Learning Clojure through Logo, it should come as no surprise that topics of education, technology, and language are of interest to me. A couple of recently published articles caught my attention:

Categories
general

Tamil Internet Conference 2018 Call for Papers

The Tamil Internet Conference in 2018 is taking place on July 4-6 in Coimbatore, Tamil Nadu, India. Anyone interested in submitting papers should do so before March 30 (see the above link for further details).

Categories
general programming

Tips for Writing SQL Queries

In my previous job, I had a basic grasp of writing a SQL query, but I was never quite comfortable with “advanced” queries. (What I call “advanced” is really intermediate at best: the nuances of joins, group-bys, and having vs. where.) I was told that whatever SQL I didn’t know would be “easy” to pick up and would happen naturally, although in practice that never quite happened. It wasn’t until I started to come up with a system for solving interview-style programming problems that I also started to come up with a system for writing any SQL query. The following is the result, which is less of a “tutorial” for “beginner SQL” and more of a systematic process for constructing a SQL query:

Here are my sequential steps for writing SQL queries in a somewhat methodical way. YMMV.
Note: ‘key’, ‘column’, and ‘field’ are used interchangeably.

Categories
programming tamil thamil

Updates from the last 18 months

The last 18 months have been eventful even if my updates have been sparse. Here’s a quick rundown of some of the things that I’ve been up to:

Happy New Year, and hoping that 2018 is a good year!

Categories
programming tamil thamil

Tamil Internet Conference 2017 – Prefix Trees for Language Processing – slides and paper

The Tamil Internet Conference for 2017 in Toronto, Ontario, Canada just concluded. I presented a more in-depth explanation of my previous post on prefix trees along with specific examples of how I have used them.

Here is the full paper that I submitted for the conference proceedings, entitled “Prefix Trees (Tries) for Tamil Language Processing”. Here is the slide deck for the presentation I gave in the conference.

The following is the full text of the paper from the link above:

Categories
clojure programming

Using Clojure to Create Multi-Threaded Spark Jobs

I was recently tasked with an ETL job that had to run as efficiently and quickly as possible. The work led me to learn more about parallel and distributed processing in Clojure. In addition to gaining a greater appreciation for what Clojure enables (once again), I also pushed the boundaries of what I thought was possible using the available tools. I ultimately ended up writing a Spark job whose executors each run N threads (currently, N=3). But the path to that solution taught me as much through what didn’t work as through what did.

Categories
clojure general java programming

Compiling a Leiningen Project from Maven

For those of you with experience with Maven, you might be wondering why anyone who is using Leiningen to build a project would then want to run that build tool from Maven, which is itself another build tool. There is a reason why I even ventured down this path. I would like to share what I have found so far, in case it benefits anyone else, but I would also like to get feedback from people who know of a better way of accomplishing the same goals.

Categories
general programming tamil thamil

Using Prefix Trees for Thamil Language Processing

Thamil computing has made a lot of progress in the past 10-20 years. Much of the work that has reached the public has been in the areas of fonts/rendering and input methods. Thanks to the continuing efforts in these areas, most of those issues have been solved, Thamil text has standardized on a single character set (Unicode), and we have nice fonts and input methods for major operating systems and mobile devices. The new environment has enabled the widespread creation and consumption of digital content in Thamil.

Now, the next set of problems to solve involves processing Thamil text that is written using the Unicode character set. Unicode is designed so that fonts for all languages can standardize on one character set, but the slight cost for Thamil language processing is added complexity. These challenges can be handled easily by representing the data in a suitable data structure, which in this case is a prefix tree (or “trie”).
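As a rough illustration of the idea, here is a minimal sketch of a prefix tree in Python. It is not the code from the clj-thamil library. The names `build_trie`, `split_letters`, and `END`, and the tiny `LETTERS` sample, are assumptions made up for this example. It assumes that a Thamil letter may span more than one Unicode code point (a consonant followed by a vowel sign or a pulli), and it uses longest-match lookup in the trie to split a string into letters.

```python
# Hypothetical sentinel key marking "a complete letter ends here".
# Every other key in the trie is a single character, so "" cannot collide.
END = ""

def build_trie(words):
    """Build a prefix tree (nested dicts) from an iterable of strings."""
    root = {}
    for word in words:
        node = root
        for ch in word:
            node = node.setdefault(ch, {})
        node[END] = word
    return root

def split_letters(trie, text):
    """Split text into letters using the longest match found in the trie."""
    letters = []
    i = 0
    while i < len(text):
        node = trie
        match_end = None
        j = i
        # Walk down the trie as far as the text allows,
        # remembering the last position where a full letter ended.
        while j < len(text) and text[j] in node:
            node = node[text[j]]
            j += 1
            if END in node:
                match_end = j
        if match_end is None:
            match_end = i + 1  # unknown code point: pass it through as-is
        letters.append(text[i:match_end])
        i = match_end
    return letters

# A tiny sample set of letters, written as code point escapes (an assumption
# for this sketch; a real table would cover the whole alphabet).
LETTERS = [
    "\u0b85",        # a
    "\u0bae",        # ma
    "\u0bae\u0bcd",  # m  (ma + pulli)
    "\u0bae\u0bbe",  # maa (ma + vowel sign aa)
]

trie = build_trie(LETTERS)
word = "\u0b85\u0bae\u0bcd\u0bae\u0bbe"  # "amma": 5 code points
letters = split_letters(trie, word)      # 3 letters
```

The word in the usage lines has five code points but only three letters, which is the kind of mismatch that makes working directly on code points awkward.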

Categories
clojure general programming tamil thamil

Speaking at Clojure/West 2015!

I’m excited to be selected as a speaker at the upcoming Clojure/West 2015 conference next month in Portland! I’ll be talking about how Clojure can be used to program in human languages other than English. There are interesting opportunities related to diversity and access. I will be drawing on my experiences with programming in/for Thamil in the clj-thamil library. And I’ll see what other interesting, related ideas I can slip in (turtles that draw?)… and put a bird on them.