Wednesday, March 30, 2005

My 43 things

Find out what my goals in life are...

Thursday, March 24, 2005

Copernic Desktop Search Plug-in Development

Recently, I published a tutorial on how to program a custom extractor for Copernic Desktop Search.

I found another blogger, Oliver Sturm, who published another and, well, better tutorial ;)

An implementation of a Copernic Desktop Search Custom Extractor in C#

Tuesday, March 22, 2005

ASP.NET: Is using informative variable names a good programming practice?

I found I could reduce my ASP.NET page size by about 50% simply by using non-informative variable names...

Indeed, when a datagrid contains widgets, they are rendered once per row with their full variable names... Imagine this line of HTML being repeated thousands of times:

<input id="OccurencesDataGrid__ctl125_VALID" type="radio" name="OccurencesDataGrid:_ctl125:JudgeChoice" value="VALID" />

The size can easily be reduced by renaming the datagrid from "OccurencesDataGrid" to "ODG", the radio button group from "JudgeChoice" to "JC", and the enumeration value from "Valid" to "V"...

Incredible size reduction... but unreadable code :(
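To get a feel for the numbers, here is a rough sketch that rebuilds the radio button line above with long and short names and compares the character counts. The class and method names are made up for the illustration, and the tag layout is only an approximation of what the DataGrid really renders.

```csharp
using System;

public static class MarkupSizeEstimate
{
    // Builds one radio button tag shaped like the example line in this post.
    // The exact markup ASP.NET emits may differ; this is an approximation.
    public static string RadioTag(string grid, int row, string group, string value)
    {
        return "<input id=\"" + grid + "__ctl" + row + "_" + value
            + "\" type=\"radio\" name=\"" + grid + ":_ctl" + row + ":" + group
            + "\" value=\"" + value + "\" />";
    }

    // Total number of characters emitted when the tag is repeated for each row.
    public static int TotalLength(string grid, string group, string value, int rows)
    {
        int total = 0;
        for (int row = 0; row < rows; row++)
        {
            total += RadioTag(grid, row, group, value).Length;
        }
        return total;
    }
}
```

Calling TotalLength("OccurencesDataGrid", "JudgeChoice", "VALID", 1000) and TotalLength("ODG", "JC", "V", 1000) shows the gap: with this layout each row is 47 characters shorter, since the grid name and the value each appear twice per tag and the group name once.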

Wednesday, March 16, 2005

A morning like any other...

Yahoo will soon launch Yah00 360, Google offers Google X for Mac addicts, LeRéverbère published yet another excellent article... This is a morning like any other.

Tuesday, March 15, 2005

Detect the client screen size in ASP.NET

I ran into a problem where I have to use a hard-coded value for the size of a DataGrid in a web page. The reason for using a pixel size instead of a percentage of the page size is to get the nice "Freeze the Header, Scroll the Grid" rendering explained on asp.net pro.

However, different client screen resolutions mean different grid sizes.

Here's a little trick to check the client screen resolution and adjust the grid to fit well.
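One way such a trick can work, sketched under assumptions: a line of client script copies screen.width into a hidden form field, and on postback the server turns that value into a pixel width for the grid. The field id "screenWidth", the helper name, and the margin and fallback values below are all hypothetical, not taken from the linked trick.

```csharp
using System;

public static class GridSizing
{
    // Client side (assumption), run before the form is submitted:
    //   document.getElementById('screenWidth').value = screen.width;
    //
    // Server side: convert the posted value into a grid width in pixels.
    // Falls back to a fixed width when the value is missing or not usable.
    public static int GridWidthFor(string postedScreenWidth, int fallbackWidth, int margin)
    {
        int screenWidth;
        if (!int.TryParse(postedScreenWidth, out screenWidth) || screenWidth <= margin)
        {
            return fallbackWidth;
        }
        return screenWidth - margin;
    }
}
```

In a page, the result would typically be assigned to the grid, for example with Unit.Pixel(GridSizing.GridWidthFor(Request.Form["screenWidth"], 760, 40)). The fallback matters because the first request arrives before any script has run.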

Friday, March 11, 2005

Java Lint (Code Analyzer)

Jlint does an honest job at analyzing Java code and finding potential bugs.

It is not overly paranoid (as IDEA's code inspector is, for instance) and does not enforce by-the-book OO programming.

To use it with Eclipse, download the Jlint engine from SourceForge and use the Eclipse plug-in offered by a third party.

Wednesday, March 09, 2005

A lexicon for machine translation

A lexicon for machine translation:

Machine Translation - techniques for allowing construction workers and architects from all over the world to communicate better with each other so they can get back to work on that really tall tower.

Artificial Intelligence - plot device used in movies like 2001, Terminator, and Matrix. Nothing to worry about.

Model - a highly simplified, idealized version of a real process or object, often found in Computational Linguistics journal and Vogue.

N-gram - measurement of the weightiness of words and phrases. For example, "and" weighs 1 gram, while "mass destruction" weighs 2 grams.

Romanization - the insidious global-cultural process of forcing French people to watch the movie "Gladiator."

Translation Memory - situation in which a translation system has already seen the source text translated before. Contrasts with translation deja vu, in which system only thinks it has seen the source text before.

[read it all]

Tuesday, March 08, 2005

Indexing postscript (.ps) files with Copernic Desktop Search

The latest version of Copernic Desktop Search allows the integration of custom "file indexers". In this post, I present a C# plug-in that allows indexing .ps files.

First, you should read the CDS API documentation, which explains how to write and install a plug-in. The code you need to produce is the following (in a C# DLL project). It uses Ghostscript's gswin32c.exe to extract the text:

using System;
using System.Diagnostics;
using System.IO;
using System.Runtime.InteropServices;

[Guid("8FF27441-6E42-485f-8D73-5DD7E1969BE9")]
public class CDSpsIndexing : ICopernicDesktopSearchFileExtractor
{
    // URI of the file that CDS asks this extractor to index.
    private string strUri;

    public CDSpsIndexing() { strUri = ""; }

    public void LoadURI([MarshalAs(UnmanagedType.BStr)] string URI) { strUri = URI; }

    [return: MarshalAs(UnmanagedType.IUnknown)]
    public object GetContentStream()
    {
        string tempFile = Path.GetTempFileName();

        // Run Ghostscript's ps2ascii.ps on the file and capture the extracted text.
        Process myProcess = new Process();
        myProcess.StartInfo.FileName = "C:/gs/gs8.13/bin/gswin32c";
        myProcess.StartInfo.Arguments = "-q -dNODISPLAY -dSAFER "
            + "-dDELAYBIND -dWRITESYSTEMDICT -dSIMPLE "
            + "ps2ascii.ps \"" + strUri + "\" -c quit";
        myProcess.StartInfo.CreateNoWindow = true;
        myProcess.StartInfo.UseShellExecute = false;
        myProcess.StartInfo.RedirectStandardOutput = true;
        myProcess.Start();

        // Read stdout before WaitForExit so that a full pipe cannot block Ghostscript.
        TextWriter tw = new StreamWriter(tempFile);
        tw.Write(myProcess.StandardOutput.ReadToEnd());
        tw.Close();

        myProcess.WaitForExit();

        // Hand the extracted text back to CDS wrapped in a COM stream.
        StreamReader sr = new StreamReader(tempFile);
        CDSStream outputStream = new CDSStream(sr.BaseStream);
        return outputStream;
    }

    public bool IsContentUnicode { get { return false; } }

    [ComRegisterFunctionAttribute]
    public static void DllRegisterServer(string registrationLogic)
    {
        Console.WriteLine("Registering ...");
    }

    [ComUnregisterFunctionAttribute]
    public static void DllUnregisterServer(string unregistrationLogic)
    {
        Console.WriteLine("Unregistering ...");
    }
}


The class "CDSStream" implements the "UCOMIStream" interface:

public class CDSStream : UCOMIStream
{
    private Stream m_Stream;

    public CDSStream(Stream pi_Stream)
    {
        m_Stream = pi_Stream;
        m_Stream.Seek(0, SeekOrigin.Begin);
    }

    // Copies up to cb bytes into pv and reports the count through pcbRead.
    public void Read(byte[] pv, int cb, IntPtr pcbRead)
    {
        int nRead = m_Stream.Read(pv, 0, cb);
        if (pcbRead != IntPtr.Zero)
        {
            Marshal.WriteInt32(pcbRead, nRead);
        }
    }

    // Only the stream size is filled in.
    public void Stat(out STATSTG pstatstg, int grfStatFlag)
    {
        pstatstg = new STATSTG();
        pstatstg.cbSize = m_Stream.Length;
    }

    // ... (remaining methods can assert false)
}

Don't forget to install the plug-in by modifying the Copernic registry entries.

In practice there is a problem. Copernic is designed to stop working when the computer's resources are in use. So, when the custom indexer works on a large file, CDS suspends itself. It then resumes by trying to index the same file (a kind of infinite loop). To work around this problem, you should disable the option "suspend indexing while computer resources are highly used".

Tuesday, March 01, 2005

Yahoo opens up its search toolbox to developers

The Sunnyvale, Calif.-based company has created the Yahoo Search Developer Network. The network will allow software developers to create new applications on top of Yahoo search.

The big differences between the Yahoo and Google APIs are:
  1. Yahoo uses REST instead of the SOAP protocol;
  2. Yahoo does not have the 1,000-queries-per-key limit. Instead, the limit varies by service (web, image, ...) and is, for instance, 5,000 for web. However, the limit is counted per sender IP address: if the IP of the sender changes, the query bank changes. That means virtually unlimited queries when a service is published and used by different users.
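To illustrate the first point, a REST query is just an HTTP GET on a URL that carries the parameters, with no SOAP envelope to build. The sketch below only assembles such a URL. The endpoint and the parameter names (appid, query, results) are written from memory of the web search service and should be treated as assumptions to check against the Yahoo Search Developer Network documentation; the class and method names are made up.

```csharp
using System;

public static class RestQuery
{
    // Assumption: endpoint of the web search service as I remember it.
    // Verify it against the official documentation before relying on it.
    public const string Endpoint =
        "http://api.search.yahoo.com/WebSearchService/V1/webSearch";

    // Builds the GET URL for a query. Every parameter travels in the query
    // string, so values are percent-encoded to keep the URL valid.
    public static string BuildUrl(string appId, string query, int results)
    {
        return Endpoint
            + "?appid=" + Uri.EscapeDataString(appId)
            + "&query=" + Uri.EscapeDataString(query)
            + "&results=" + results;
    }
}
```

The resulting URL can be fetched with any HTTP client (WebClient.DownloadString, for instance), and the response is plain XML to parse.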