Machine Learning Self-Study Track

I started with Machine Learning a while back and had a slightly hard time getting help from the local community. The main reason is that the Machine Learning community in general is way behind the state of the art in industry and research. This is true for almost all fields nowadays, but with Machine Learning the gap is more pronounced because of the recent fast-paced developments in the industry.

On the other hand, once you know what to study, things are much easier than in many other fields, such as security. Here I outline the plan I followed to get to where I am (which isn’t too far ahead but still a little better than what most people know, IMHO).

So, here’s my guide for getting started with Machine Learning self-study.

  1. Start with Andrew Ng’s Coursera course, Machine Learning. That’s the advice almost everyone seems to give, and it’s great advice. The course is basic and eases you into the field with few prerequisites and not much depth. Be careful though: do not think after completing it that you are an expert in Machine Learning. It skips quite a few areas and the skills you need to be above average. It does get you started with practical work, so it is easy to think you’re done once you finish.
  2. So, after you complete the course in its entirety, including the assignments, I suggest you move on to Prof. Nando de Freitas’ undergrad course. This is a much more detailed course and gives you a very different view of ML than the traditional outlines. Of course, you might have to brush up on your Probability, Calculus and Linear Algebra. You can’t really do anything without these three.
  3. For the above three, I suggest the following courses:
    1. Probability: Probability for Life Sciences by UCLA’s Math Department. You can find videos for this easily.
    2. Calculus: I strongly suggest you go with Virtual University Pakistan’s Calculus-I course by Dr. Faisal Shah Khan. It’s a great course but it’s in Urdu. If you don’t know Urdu, you can find your own series. Please let me know in the comments about great resources for this.
    3. Linear Algebra: Of course, this can only be done with Gilbert Strang’s Linear Algebra course from OCW.
  4. After that, you can start with the grad course and the second grad course by Prof. Nando de Freitas. Both have very detailed video lectures.

Of course, you also need to work with tools other than MATLAB. I strongly suggest the Python PyData stack. The full list would be:

  1. Python PyData full stack (plus go through their yearly videos as well)
  2. Theano
  3. Torch
  4. Keras

That’s what I have till now. I might add more when I know more inshaallah.


Learning How to Learn

I’ve just started another Coursera course, this one about learning in general. The course is called Learning How to Learn: Powerful mental tools to help you master tough subjects. It’s a fairly easygoing course, as far as I can see. The assignments and quizzes are straightforward for the most part, but the important bit is that the instructors share their life experiences about learning. I hope to get through this course; I have enough ambition that I’ve even signed up for the paid “Signature Track” version.

One mental tool that I found really interesting is using the diffuse mode of thinking to get new ideas about difficult-to-solve problems. It’s best explained in the videos through Edison’s example: he would sit in his chair and let his hand hang to one side while holding a few ball bearings in it. He would then relax and let his mind wander, drifting off towards sleep. The mind would shift to diffuse thinking and would eventually find some new avenue to explore to help solve the issue at hand. This usually happens when you’re about to fall asleep, and that is where the ball bearings come into play. They would fall and create a bit of a racket, pulling him back from sleep so that he could grasp the fledgling ideas and put them on paper. Cool trick!

Backup Using rsync

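Here’s a mini howto on backing up files to a remote machine using rsync. It shows progress while it does its thing and updates any remote files that are older than your local copies (--update skips files that are newer on the remote end). Since --delete is not passed, files on the remote end are kept even after they have been deleted from your local folder.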

rsync -v -r --update --progress -e ssh /media/nam/Documents/ nam@

Here, /media/nam/Documents/ is the local folder and /media/nam/backup/documents/ is the backup folder on the machine with IP
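
As a rough sketch, the same backup can be wrapped in a small script that prints the command so you can check it before running it. The host name backup-host.example is a made-up placeholder, since the remote address is not shown here, and the user@host:path form of the destination is my assumption about how the pieces fit together. The variable and function names are hypothetical too.

```shell
#!/bin/sh
# Sketch only: REMOTE_HOST is a made-up placeholder, not the real address.
SRC="/media/nam/Documents/"            # local folder (from the post)
REMOTE_USER="nam"                      # remote login (from the post)
REMOTE_HOST="backup-host.example"      # hypothetical; replace with your host or IP
DEST="/media/nam/backup/documents/"    # backup folder on the remote machine

# Print the rsync command so it can be checked before it is run.
# An optional first argument (for example -n for a dry run) is added as an extra flag.
backup_cmd() {
    printf 'rsync -v -r --update --progress %s-e ssh %s %s@%s:%s\n' \
        "${1:+$1 }" "$SRC" "$REMOTE_USER" "$REMOTE_HOST" "$DEST"
}

backup_cmd        # the command with the flags used in the post
backup_cmd -n     # the same command with rsync's dry-run flag
```

Running the dry-run variant first is a cheap safety check: with -n, rsync lists what it would transfer without copying anything.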

How to Access Google AdSense Reports

So, AdMob was acquired a while ago by Google, and it was recently announced that AdMob’s publisher reports would no longer be available through the old APIs. Instead, they now have to be retrieved through the AdSense API, which is based on OAuth 2.0 and thus a real pain for those just getting started.

Turns out, the process is quite straightforward but extremely poorly documented. You can go through the AdSense reporting docs, the Google API library and the OAuth 2.0 specs, but you would soon be lost. After spending a couple of days decoding the requirements, I worked out a bare-bones approach to accessing the stats. And here is how.


Back to WordPress

Well, that was short-lived. I moved away from WordPress — only to come back after around 6 months and one post. Seems like Octopress is too much of a hassle for someone as unstable as me. Maybe another time when I’m more focused.

Hadoop 2.2.0 – Single Node Cluster

We’re going to use the Hadoop tarball we compiled earlier to run a pseudo-cluster. That means we will run a one-node cluster on a single machine. If you haven’t already read the tutorial on building the tarball, please head over and do that first.

Getting started with Hadoop 2.2.0: Building

Start up your (virtual) machine and log in as the user ‘hadoop’. First, we’re going to set up the essentials required to run Hadoop. By the way, if you are running a VM, I suggest you kill the machine used for building Hadoop and restart from a fresh instance of Ubuntu to avoid any compatibility issues later. For reference, the OS we are using is 64-bit Ubuntu 12.04.3 LTS.
