Wednesday, June 30, 2010

Cooper Union and Scalable Program Design

Today was my official first day in the Chemistry summer program at the Cooper Union for the Advancement of Science and Art. I quickly found that the program would be far more computer science and web development than chemistry, and I was overjoyed. Our project has to do with spectroscopy: measuring how much light, and which colors, a substance absorbs. A spectroscopy measurement yields a graph of how much light is absorbed at each frequency, and this graph can actually be used to identify the substance being measured. Our project goal is to create a web interface that, when given a file in any of several formats containing such a graph, can quickly and efficiently search a library of pre-existing graphs and accurately determine what substance the input belongs to. Put simply, a website with some algorithms in the back end.
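To make the "search a library of graphs" idea concrete, here is a minimal sketch in Python. It is not the project's code (the real back end is in C), and everything in it is an assumption for illustration: the function names, the made-up absorbance values, the idea that every spectrum is sampled at the same frequencies, and the choice of cosine similarity as the comparison measure.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length absorbance vectors."""
    if len(a) != len(b):
        raise ValueError("spectra must be sampled at the same frequencies")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an all-zero spectrum matches nothing
    return dot / (norm_a * norm_b)


def best_match(query, library):
    """Return (name, score) for the library spectrum most similar to query."""
    scored = ((name, cosine_similarity(query, spectrum))
              for name, spectrum in library.items())
    return max(scored, key=lambda pair: pair[1])


# Hypothetical library: made-up absorbance values at five shared frequencies.
LIBRARY = {
    "substance_a": [0.1, 0.8, 0.3, 0.0, 0.0],
    "substance_b": [0.0, 0.1, 0.2, 0.9, 0.4],
}

# A slightly noisy measurement that should resemble substance_a.
query = [0.12, 0.75, 0.35, 0.02, 0.0]
name, score = best_match(query, LIBRARY)
print(name, round(score, 3))
```

A linear scan like this compares the input against every stored graph, which is perfectly fine for a library of modest size. Real spectra would also need to be resampled onto a common frequency grid before comparing, which this sketch skips.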

But the more interesting part is what went on during just the first day. One of the mentors assigned to the project is Robert Marano (@robmarano), who is the CEO of InDorse Technologies, a startup focusing on file assurance and protection that recently made the Red Herring list of top startup companies. While discussing what road we should travel down for our project, he insisted on a cloud-based operation using Hadoop and HBase. The Cooper Union undergraduate students acting as mentors at the program tried to argue for sticking with the original C/PHP/JavaScript combination that had been working just fine, but they were quickly rebutted. (They were using C for the back end to compare graphs and execute algorithms, PHP to serve content, and JavaScript for AJAX and the like.) Keep in mind this is a website primarily for chemists looking up spectroscopy data, something many people had not even heard of until the first paragraph of this post.

The point I am trying to get across is mainly that while it is always cool to say your website is hosted on Amazon's EC2, uses Hadoop with some NoSQL database, and can handle as many users as Google, it is not necessary. Such an effort is effectively a waste of time, especially when we already have some basic code written. The way I see it, when you make a website, you should plan for ten times as many users as you hope for, so that you are always prepared, but definitely do not plan for any more than that, especially when your service has such a limited application as chemical spectroscopy. What I hope to achieve by the end of the week is to convince them to use my WebApi, which is already written and works pretty nicely, but we will see how everything progresses. Hope summer is going great for everyone. If anybody has any other suggestions for the project, feel free to comment.
