Joe Cincotta: Thoughts and such…

Nerdism for the masses.

Why we moved to Google Code

We have been using SourceForge to host all our Open Source projects for a long, long time. It was a tough decision to move away; however, as our teams and projects grew, and as time went on, we felt SourceForge had lost touch with some of the fundamentals of software development for the sake of monetization.

Over the course of 2008 I saw SourceForge move towards a services-based model in an attempt to support the projects which reside on it. In and of itself this is a great idea; the real problem was that the world of web-based applications had rocketed ahead whilst the core platform of SourceForge felt like it lagged behind.

Google Code was launched way back in August 2006 and has adopted the typical Google approach to its developer platform. This approach means that adding content, code and downloads is super simple. We had turned off the SourceForge Issue Tracker for all our projects because it was so cumbersome, whereas the Google Code Issue Tracker is just a joy to work with.

Overall, we feel that development and team collaboration will be vastly improved by making the move.

The Pixolüt Industries projects on Google Code are:

BizBlox

PreNIS

xReplace

Filed under: Google, Industry Opinion, Open Source, Software Development

Multilingual Searching is not International Searching

http://googleblog.blogspot.com/2007/10/helpful-suggestions-around-globe.html

I was reading this status update post from the Google team and it made me think about multilingual searching. The deeper issue of globalization is not necessarily the language; it's the character set. The character set is what makes searching for Russian content from a computer in the United States very difficult. You need to set up your keyboard language to Russian and then try to figure out how to make all the symbols. This probably doesn’t seem like a big deal to someone in the United States, but when you work for a multinational company in Europe it really is.

About four years ago I was working on a project which used an early version of BizBlox. The client was NASDAQ-listed with US offices, but the primary user group was a European food conglomerate. The system I was developing managed advertising assets for over 2000 brands across about ten different languages. The core problem was that some of the most important people searching the system were based in the United States and they didn’t know anything about character sets or international search, but they knew what they wanted because they knew what the words ‘looked like’.

The solution was actually really simple. For every Latin-based character set (non Asian, Aramaic or Sanskrit) there are similarities in letterform which can be assigned to a standard ‘sort-of’ equivalent US keyboard character or combination thereof. This solution was about leveraging the end user's visual recognition of the foreign character set, so sometimes, especially with the Greek character set, more than one US keyboard character can match a single foreign letter (letterform similarity is not necessarily a one-to-one relationship). The BizBlox codebase to this day has the simple version of the visual multilingual character mapping table in its very simple search engine.

This concept is by no means original. It has existed for as long as multiple symbol sets and languages have… the real idea here is that there is more than one way to recognize symbols of other languages and all of them should be treated equally. To the untrained eye, is the word ἄβουλος transcoded to the English character set as aboulos or abovaos? It should not matter.
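
To make that concrete, here is a rough Python sketch of a one-to-many lookalike table. It is not the BizBlox code: the table entries and the names LOOKALIKES, strip_marks and latin_keys are made up for illustration, and a real table would need to cover far more letters. Each foreign letter maps to every US keyboard spelling a reader might plausibly pick, and a word expands to all combinations.

```python
import itertools
import unicodedata

# Hypothetical, deliberately tiny table (an assumption, not the BizBlox table).
# Each Greek letter maps to every Latin spelling a reader might plausibly type.
LOOKALIKES = {
    "α": ("a",),
    "β": ("b",),
    "λ": ("l", "a"),   # transliterated as l, but looks a bit like an A
    "ο": ("o",),
    "υ": ("u", "v"),   # transliterated as u, but looks like a v
    "σ": ("s",),
    "ς": ("s",),       # final sigma
}


def strip_marks(text):
    """Drop accents and breathing marks by decomposing and removing combining marks."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def latin_keys(word):
    """Return every Latin spelling the table allows for the given word."""
    base = strip_marks(word.lower())
    # Letters missing from the table pass through unchanged.
    options = [LOOKALIKES.get(ch, (ch,)) for ch in base]
    return {"".join(combo) for combo in itertools.product(*options)}


print(sorted(latin_keys("ἄβουλος")))
```

Both aboulos and abovaos come out as valid keys for the same word, so neither reader is wrong. The number of keys grows multiplicatively with each ambiguous letter, so a real system would probably want to cap word length or store the keys at index time rather than compute them per query.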

Used in a search query, this means that, in its simplest form, a search for the letter e in a word could also find words containing è, é, ê, or ë. In a much more complex example, a search for a word with the letters TH, T or O could find a word with the letter Θ in it (the Greek letter Theta).
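
A minimal Python sketch of that search behaviour might look like the following. The names MULTI, fold, keys and search are hypothetical, the one-entry table is only there to show the Theta case (using the Greek capital Theta, U+0398), and a linear scan stands in for whatever index a real search engine would use.

```python
import itertools
import unicodedata

# Hypothetical one-to-many entries (an assumption for illustration).
# Accented Latin letters need no entries: stripping combining marks handles them.
MULTI = {"θ": ("th", "t", "o")}


def fold(text):
    """Lowercase and strip combining marks, so 'é' becomes 'e'."""
    decomposed = unicodedata.normalize("NFD", text.lower())
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def keys(word):
    """Every folded spelling of a word, expanding one-to-many letters."""
    options = [MULTI.get(ch, (ch,)) for ch in fold(word)]
    return {"".join(combo) for combo in itertools.product(*options)}


def search(query, words):
    """Return the words where the folded query appears in any of the word's keys."""
    q = fold(query)
    return [w for w in words if any(q in k for k in keys(w))]


words = ["crème", "café", "Noël", "plain", "Θήτα"]
print(search("e", words))   # the accented French words
print(search("th", words))  # the word starting with Theta
```

The query is folded the same way as the stored words, so it does not matter whether the person searching types é or e.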

So how does this have anything to do with Google? Well, the suggestion feature allows for powerful and truly global search, but adding this multi-directional character search context to suggestions would make it powerful across borders as well as inside them.

[Update: a pingback to a newer article which outlines some of the ideas I raised in this post over a year ago: http://googleblog.blogspot.com/2008/11/our-international-approach-to-search.html ]

Filed under: BizBlox, Google, Industry Opinion
