Well, that was short-lived. I moved away from WordPress — only to come back after around 6 months and one post. Seems like Octopress is too much of a hassle for someone as unstable as me. Maybe another time when I’m more focused.
We’re going to use the Hadoop tarball we compiled earlier to run a pseudo-cluster, that is, a one-node cluster on a single machine. If you haven’t already read the tutorial on building the tarball, please head over and do that first.
Start up your (virtual) machine and log in as the user ‘hadoop’. First, we’re going to set up the essentials required to run Hadoop. By the way, if you are running a VM, I suggest you kill the machine used for building Hadoop and restart from a fresh instance of Ubuntu to avoid any compatibility issues later. For reference, the OS we are using is 64-bit Ubuntu 12.04.3 LTS.
I wrote a tutorial on getting started with Hadoop back in the day (around mid 2010). Turns out that the distro has moved on quite a bit with the latest versions. The tutorial is unlikely to work. I tried setting up Hadoop on a single-node “cluster” using Michael Noll’s excellent tutorial, but that too was out of date. And of course, the official documentation on Hadoop’s site is lame.
Having struggled for two days, I finally got the steps smoothed out and this is an effort to document it for future use.
I return with a minor post after another long break. This time, it’s about writing better English. Now, this isn’t humblebragging but I cannot be considered excellent at English writing — at least not by native standards. English is not my first language and I haven’t had much formal English education. I have, however, read a lot. Even if my English is not good, I can still point out some tips shared by experts.
Here’s the first one of those shared by Amanda Patterson on Writers Write. It’s a list of 45 words you can use to put emphasis on words without using the word “very”. I found it refreshingly helpful.
Bear in mind though that you cannot just go ahead and use a word without looking up its usage examples. Some words might have negative connotations even though the dictionary meanings look positive. For example, if you use the word ‘adequate’ to describe someone’s work, they might be offended even though the dictionary meaning is that of acceptable quality.
p.s. After writing this, I searched for the word “very” and found two instances where I had used the word myself. I replaced it with better alternatives.
So you’ve started working with Django and you love the admin interface that you get for free with your models. You deploy half of your app with the admin interface and are about to release when you figure out that anyone who can modify a model can do anything with it. There is no concept of “ownership” of records!
Let me give you an example. Let’s say we’re creating a little MIS for the computer science department where each faculty member can put in his or her courses and record the course execution (what was done per lecture). That would be a nice application. (In fact, it’s available open source on GitHub and that is what this tutorial is referring to.) However, the issue is that all instructors can access all the course records and there is no way of ensuring that an instructor can modify only the courses that s/he taught. This isn’t easily possible because the admin doesn’t have “row-level permissions”.
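To make the idea of “ownership” concrete, here is a minimal, framework-free sketch of the rule we want. It is not Django’s API and not the code from the project mentioned above; the names `User`, `Course`, `visible_courses` and `can_modify` are made up for illustration. In Django itself, this kind of filter is usually applied by overriding the admin class’s queryset method (`get_queryset` in current versions).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class User:
    username: str
    is_superuser: bool = False


@dataclass
class Course:
    title: str
    instructor: User  # the "owner" of this row (hypothetical field name)


def visible_courses(user, courses):
    """Return only the rows this user should see in a listing."""
    if user.is_superuser:
        return list(courses)
    return [c for c in courses if c.instructor == user]


def can_modify(user, course):
    """Row-level check: only superusers or the owning instructor may edit."""
    return user.is_superuser or course.instructor == user


# Tiny demo data: two instructors and one superuser.
alice = User("alice")
bob = User("bob")
admin = User("admin", is_superuser=True)
courses = [Course("Algorithms", alice), Course("Databases", bob)]

print([c.title for c in visible_courses(alice, courses)])  # ['Algorithms']
print(can_modify(alice, courses[1]))                       # False
```

The point of the sketch is that both the listing and the edit check depend on who is asking, which is exactly what model-level permissions alone cannot express.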
I’ve been teaching “Applied Algorithms and Programming Techniques” and we just reached the topic of AVL Trees. Having taught half of the AVL tree concept, I decided to code it in python — my newest adventure. Bear in mind that I have never actually coded an AVL tree before and I’m not particularly comfortable with python. I thought it would be a good idea to experiment with both of them at the same time. So, I started up my python IDE (that’s Aptana Studio, btw) and started coding.
For the newbie programmer, the code itself may not be very useful since you can find better code online. The benefit is in being able to look at the process. You can take a look at the commits I made along the way over here on github. You can take a look at how I structured the code when I began and how I added bits and pieces. This abstraction should help in solving other problems as well. The final code (along with a rigorous unit test file) can be seen here: https://github.com/recluze/python-avl-tree
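For readers who want something to compare against while going through the commits, here is a minimal sketch of AVL insertion in Python. It is not the code from the repository above, just one common way to write it; the names `Node`, `insert`, `rotate_left` and `rotate_right` are my own choices, and duplicates are simply ignored as a simplifying assumption.

```python
class Node:
    def __init__(self, key):
        self.key = key
        self.left = None
        self.right = None
        self.height = 1  # a single node has height 1


def height(node):
    return node.height if node else 0


def update_height(node):
    node.height = 1 + max(height(node.left), height(node.right))


def balance_factor(node):
    # Positive means left-heavy, negative means right-heavy.
    return height(node.left) - height(node.right)


def rotate_right(y):
    x = y.left
    y.left = x.right
    x.right = y
    update_height(y)  # y is now the lower node, so update it first
    update_height(x)
    return x


def rotate_left(x):
    y = x.right
    x.right = y.left
    y.left = x
    update_height(x)
    update_height(y)
    return y


def insert(node, key):
    """Insert key into the subtree rooted at node; return the new subtree root."""
    if node is None:
        return Node(key)
    if key < node.key:
        node.left = insert(node.left, key)
    elif key > node.key:
        node.right = insert(node.right, key)
    else:
        return node  # duplicate key: nothing to do

    update_height(node)
    bf = balance_factor(node)
    if bf > 1:  # left-heavy
        if balance_factor(node.left) < 0:  # left-right case
            node.left = rotate_left(node.left)
        return rotate_right(node)
    if bf < -1:  # right-heavy
        if balance_factor(node.right) > 0:  # right-left case
            node.right = rotate_right(node.right)
        return rotate_left(node)
    return node


def in_order(node):
    if node is None:
        return []
    return in_order(node.left) + [node.key] + in_order(node.right)


# Inserting sorted keys would degenerate a plain BST into a linked list.
root = None
for k in range(1, 8):
    root = insert(root, k)

print(in_order(root))         # [1, 2, 3, 4, 5, 6, 7]
print(root.key, root.height)  # 4 3
```

Inserting the keys 1 to 7 in sorted order is a handy test: the rotations keep the tree perfectly balanced with 4 at the root.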
Back when I first became interested in science, one of the first areas I was interested in was Biology, specifically human evolution, because of the hype it gets. The whole religion-versus-science conflict becomes very pronounced in the creation-or-evolution debate. I read through a lot of material and was able to understand natural selection quite easily. It’s not as complicated as some would like to suggest. It’s simply this: whoever is better in a particular situation survives. It’s easy to understand once you realize that that’s a tautology. Whoever survives is the “fittest” and the fittest survives. So, no problems there, but there was always something that didn’t quite fit. I could never quite digest the “theory” that micro-evolution (birds changing bone shape etc.) could lead to macro-evolution (going from single-strand RNA to fish).
Finally, after much thought, I realized something. The way evolution is normally explained is by half a process of induction. The proponents of evolution (by the way, in the rest of this post, “evolution” should be read as “macro-evolution”) suggest that there is a gradual change from one species to another. You get to see a lot of “minor changes” and finally, with some blanks, you can see the whole chain. The base case, however, is missing. Where does this process start?
According to Darwinian evolution, if you go back in time, you go back in complexity. From complex mammals, you get fish and from there, you get stuff like amoebas, and then very simple living material like RNA etc. The problem with that, though, is that there comes a point where you can’t get any simpler. If you do, your “living thing” cannot reproduce. The reason is that reproduction is a fairly complicated process and anything that does it can’t be said to be the basic organism. However, if you get any simpler, you lose the ability to reproduce and then you cannot demonstrate survival of the fittest because no matter how fit you are, you cannot pass on your traits.
Now, after a long time, I came across this article on MIT Technology Review which documents an interview with George Whitesides. He is introduced in the article with these words:
Harvard professor George Whitesides has spent his career solving problems in science and industry—he cofounded the pharmaceutical giant Genzyme, and he’s the world’s most cited living chemist.
Please read through the brief interview. It’s very informative and thought-provoking. Of relevance here is the answer to the first question, quoted below for the sake of completeness.
Technology Review: What’s the problem you have most wanted to solve and haven’t been able to?
Whitesides: There’s an intellectual problem, which is the origin of life. The origin of life has the characteristic that there’s something in there as a chemist, which I just don’t understand. I don’t understand how you go from a system that’s random chemicals to something that becomes, in a sense, a Darwinian set of reactions that are getting more complicated spontaneously. I just don’t understand how that works. So that’s a scientific problem.
That’s a concise way of explaining the problem that I just introduced. If you get simpler beyond a certain point, you can no longer have a Darwinian set of reactions (i.e. survival of the fittest). So, the question is: why aren’t we told about this problem in the Darwinian theory when we’re all taught evolution in school? It’s not that hard to explain.