It has been quiet on the SALSAdev blog for a while. We have been busy putting together the pieces of our latest semantic prototype, news.NET. Please read on after the jump, but first, I would like to take this opportunity to thank you all for your patience, support and especially for all the feedback that […]
Sep 01, 2008 | By: Stephane | No Comments
SALSA was not intended to be used as a classification system. Many other tools exist that do a fair job. Yet, as a side effect of mental semantic representation, SALSA is capable of classifying unstructured data by itself. Not only is this a nice feature, but its computation is light, meaning SALSA is a potential technology […]
Apr 24, 2008 | By: Stephane | 2 Comments
It almost feels like going back to basics, but we have done it: taking apart our own technology in order to deploy a light version of it.
Mar 16, 2008 | By: Stephane | No Comments
While analyzing unstructured or dirty data, it is sometimes hard to discriminate strings that are not actual words (garbage, tags, typos, …). In this case, an automated method to differentiate actual words from garbage is of great help. While several approaches exist, I will demonstrate two methods available at no cost with Open Source Software:
Feb 15, 2008 | By: Stephane | No Comments
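The excerpt above does not name the two methods the post goes on to demonstrate, so the following is only a rough illustration of the general idea, not the author's approach: a plain dictionary lookup that separates tokens found in a word list from everything else. The tiny `VOCABULARY` set and the function name `split_words_and_garbage` are hypothetical; a real run would load a full word list instead.

```python
# Hypothetical mini-vocabulary (assumption); in practice, load a complete
# word list for the language of the data being cleaned.
VOCABULARY = {"the", "data", "is", "clean", "word", "words"}


def split_words_and_garbage(text, vocabulary=VOCABULARY):
    """Split whitespace-separated tokens into (words, garbage).

    A token counts as a word only if, after trimming surrounding
    punctuation, it is purely alphabetic and appears in the vocabulary.
    """
    words, garbage = [], []
    for token in text.split():
        # Trim common punctuation so "clean." still matches "clean"
        core = token.strip(".,;:!?\"'()")
        if core and core.isalpha() and core.lower() in vocabulary:
            words.append(core)
        else:
            # Tags, typos, and mixed alphanumeric strings all land here
            garbage.append(token)
    return words, garbage
```

Calling `split_words_and_garbage("The data is <br> cleann xq7z.")` keeps `The`, `data`, and `is` as words and flags the tag, the typo, and the alphanumeric string as garbage. A lookup like this cannot tell a typo from a rare but valid word, which is one reason more than one method is worth comparing.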
A common problem when working with internet data is “What can I do about all these tags?” Cleaning HTML can be a daunting task. A simple workaround is to use a text-based web browser.
Feb 12, 2008 | By: Stephane | No Comments
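The post above relies on a text-based web browser, which this excerpt does not name. As a stand-in for readers without one at hand, here is a minimal sketch that uses a different technique, Python's standard `html.parser` module, to get a similar tag-free result. The class name `TextExtractor`, the helper `html_to_text`, and the choice of which tags count as block-level are assumptions made for illustration.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collect visible text and drop tags, scripts, and styles."""

    SKIP = {"script", "style"}  # content of these tags is never shown
    BLOCK = {"p", "div", "br", "li", "tr", "h1", "h2", "h3"}  # assumed set

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self._chunks = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self._skip_depth += 1
        elif tag in self.BLOCK:
            # Keep text from adjacent blocks from running together
            self._chunks.append(" ")

    def handle_endtag(self, tag):
        if tag in self.SKIP and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        if self._skip_depth == 0:
            self._chunks.append(data)

    def text(self):
        # Collapse all runs of whitespace into single spaces
        return " ".join("".join(self._chunks).split())


def html_to_text(html):
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return parser.text()
```

For example, `html_to_text("<p>Hello <b>world</b>!</p>")` yields `Hello world!`. Unlike a text-based browser, this sketch makes no attempt to lay out tables or lists; it only removes markup.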