Machine Learning, python, research, resources

Deep Learning for Protein Function Prediction

Protein function prediction means taking information about a protein (such as its amino acid sequence and its 2D and 3D structure) and trying to predict which functions it will exhibit. This has implications in several areas of bioinformatics and affects how drugs are created and diseases are studied. It is typically an intensive task requiring input from biologists and computer experts alike, and annotating newly found proteins requires empirical as well as computational results.

We, here at FAST NU, recently came up with a new method for predicting the functions of proteins using only their amino acid sequences. We dubbed it DeepSeq, since it’s based on Deep Learning and works on protein sequences! The sequence is the first piece of information we get when a new protein is found and is thus readily available. (Other pieces require a lot more effort.)
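To give a rough idea of what "using only the amino acid sequence" can look like as model input, here is a minimal sketch of one-hot encoding a protein sequence. It is not DeepSeq's actual preprocessing: the alphabet ordering, the `one_hot_encode` name and the fixed `max_len` padding are all assumptions made for illustration.

```python
# The 20 standard amino acids in one-letter code.
# The ordering here is an arbitrary choice for this sketch.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
AA_INDEX = {aa: i for i, aa in enumerate(AMINO_ACIDS)}


def one_hot_encode(sequence, max_len):
    """Return a max_len x 20 matrix (list of lists) for a protein sequence.

    Hypothetical helper: unknown residues and padding positions are left
    as all-zero rows, and sequences longer than max_len are truncated.
    """
    matrix = [[0] * len(AMINO_ACIDS) for _ in range(max_len)]
    for pos, residue in enumerate(sequence.upper()[:max_len]):
        idx = AA_INDEX.get(residue)
        if idx is not None:
            matrix[pos][idx] = 1
    return matrix


# "X" is not a standard residue, so its row stays all zeros.
encoded = one_hot_encode("MKTAX", max_len=8)
```

A matrix like this could then be fed to a neural network; how a real model pads, truncates or embeds sequences is a separate design choice.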

We have successfully applied DeepSeq to predict protein function from sequences alone, without requiring any input from domain experts. The paper isn’t peer reviewed yet, but we have made it available as a preprint and put our full code on GitHub so you can review it yourself.

We believe DeepSeq is going to be a breakthrough, inshaallah, in the field of bioinformatics and in how function prediction is done. Let’s see if I can come up with an update in a year, after the paper has been read a few times by domain experts and we have a detailed peer review.

DeepSeq

Announcements, Machine Learning, resources, Uncategorized, Video

Machine Learning Video Lectures

I taught an introductory Machine Learning course to BS students at FAST Peshawar in Fall 2015. The feedback was quite positive so I decided to offer another course to the MS/PhD students in the next semester. The mode of teaching was also a bit different: we tried doing the pen-tablet-augmented-multimedia-slides model. The semester is still in progress but we have the core of the basics done now.

The lectures are in Urdu, so they might be easier to follow for those who understand the language. I will be uploading future videos as they come up, inshaallah. You can see the first video below and follow the complete collection on Vimeo here: https://vimeo.com/album/3770825

Machine Learning – Lecture 01-A (Spring 2016) from recluze on Vimeo.

Machine Learning, resources

Machine Learning Self-Study Track

I started with Machine Learning a while back and had a slightly hard time getting help from the local community. The main reason was that the Machine Learning community in general is way behind the state of the art in industry and research. This is true for almost all fields nowadays, but with Machine Learning the issues are more pronounced due to the recent fast-paced developments in the industry.

On the other hand, once you know what to study, things are much easier than in many other fields such as security. Here I will outline the plan I followed to get to where I am (which isn’t too far ahead but still a little better than what most people know, IMHO).

So, here’s my guide for getting started with Machine Learning self-study.

  1. Start with Andrew Ng’s Coursera course, Machine Learning. That’s the advice almost everyone seems to give, and it’s great advice. The Coursera course is completely basic and eases you into the field with few prerequisites and not much depth. Be careful though: do not think after completing the course that you are an expert in Machine Learning. It misses quite a few areas and the skills needed to be above average. It does get you started with practicals, so you are likely to think you’re already done after finishing the course.
  2. So, after you complete the course in its entirety, including the assignments, I suggest you start with Prof. Nando de Freitas’ undergrad course. This is a much more detailed course and will give you a very different view of ML than traditional outlines. Of course, you might have to brush up on your Probability, Calculus and Linear Algebra. You can’t really do anything without these three.
  3. For the above three, I suggest the following courses:
    1. Probability: Probability for Life Sciences by UCLA’s Math Department. You can find videos for this easily.
    2. Calculus: I strongly suggest you go with Virtual University Pakistan’s Calculus-I course by Dr. Faisal Shah Khan. It’s a great course but it’s in Urdu. If you don’t know Urdu, you can find your own series. Please let me know in the comments about great resources for this.
    3. Linear Algebra: Of course, this can only be done with Gilbert Strang’s Linear Algebra course from OCW.
  4. After that, you can start with the grad course and the second grad course by Prof. Nando de Freitas. Both have very detailed video lectures.

Of course, you also need to work with tools other than MATLAB. I strongly suggest the Python PyData stack. The full list would be:

  1. Python PyData full stack (plus go through their yearly videos as well)
  2. Theano
  3. Torch
  4. Keras

That’s what I have till now. I might add more when I know more inshaallah.

Geek stuff, python, resources, Tutorials

AVL Tree in Python

I’ve been teaching “Applied Algorithms and Programming Techniques” and we just reached the topic of AVL Trees. Having taught half of the AVL tree concept, I decided to code it in Python, my newest adventure. Bear in mind that I have never actually coded an AVL tree before and I’m not particularly comfortable with Python. I thought it would be a good idea to experiment with both of them at the same time. So, I started up my Python IDE (that’s Aptana Studio, btw) and started coding.

For the newbie programmer, the code itself may not be very useful since you can find better code online. The benefit is in being able to look at the process. You can take a look at the commits I made along the way over here on GitHub to see how I structured the code when I began and how I added bits and pieces. Seeing that process should help in solving other problems as well. The final code (along with a rigorous unit test file) can be seen here: https://github.com/recluze/python-avl-tree
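For readers who want a feel for the core idea before opening the repository, here is a minimal sketch of AVL insertion with rotations. It is not the code from the repository linked above; the `Node` class and the helper names are hypothetical and chosen only for this illustration.

```python
class Node:
    def __init__(self, key):
        self.key = key
        self.left = None
        self.right = None
        self.height = 1  # a leaf has height 1 in this sketch


def height(node):
    return node.height if node else 0


def update_height(node):
    node.height = 1 + max(height(node.left), height(node.right))


def balance_factor(node):
    # Positive means left-heavy, negative means right-heavy.
    return height(node.left) - height(node.right)


def rotate_right(y):
    x = y.left
    y.left = x.right
    x.right = y
    update_height(y)  # y is now below x, so update it first
    update_height(x)
    return x


def rotate_left(x):
    y = x.right
    x.right = y.left
    y.left = x
    update_height(x)  # x is now below y, so update it first
    update_height(y)
    return y


def insert(node, key):
    """Insert key into the subtree rooted at node and return the new root."""
    if node is None:
        return Node(key)
    if key < node.key:
        node.left = insert(node.left, key)
    elif key > node.key:
        node.right = insert(node.right, key)
    else:
        return node  # duplicates are ignored in this sketch

    update_height(node)
    bf = balance_factor(node)
    if bf > 1:  # left-heavy
        if key > node.left.key:  # left-right case
            node.left = rotate_left(node.left)
        return rotate_right(node)
    if bf < -1:  # right-heavy
        if key < node.right.key:  # right-left case
            node.right = rotate_right(node.right)
        return rotate_left(node)
    return node


def in_order(node):
    return in_order(node.left) + [node.key] + in_order(node.right) if node else []


# Inserting sorted keys would degrade a plain BST into a linked list;
# the rotations keep the AVL tree balanced instead.
root = None
for k in range(1, 8):
    root = insert(root, k)
```

Deletion needs the same rebalancing on the way back up and is left out here to keep the sketch short.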

Design, resources, Tutorials

How to Create a Beamer Template — A Newbie’s Tutorial

I started switching full-time to Ubuntu (once again) a couple of weeks ago. Turns out, it’s in much better condition than when I last tried it. Anyway, one of the problems was finding a replacement for PowerPoint. I hate creating presentations for classes (in fact, I think they’re counter-productive) but I have no choice for the moment. So, I decided to give LibreOffice Impress a chance. That was an hour of my life down the drain. Finally, I returned to Beamer. Of course, I had to write my own theme because I couldn’t use the same theme used by all the rest of the world. To cut this long and boring story short, I tried very hard to find a tutorial on writing Beamer themes, couldn’t find one, learned it through experiment and decided to write the tutorial myself. Here is that tutorial.

Continue reading “How to Create a Beamer Template — A Newbie’s Tutorial”

resources, Students

CS 303 Software Engineering (NU) Administrivia

Update Sep 08, 2011: Lectures are now available on the Lecture Server. Please get the updates there.

This is a (hopefully) temporary location for posting the content that I want communicated to the students of the CS303 Software Engineering course. It will be posted to the lecture server as soon as I get access, inshaallah. For the time being, bookmark this page and keep checking for updates.

CS303- Course Outline – Fall 2011
Slideset-01
Slideset-02

Linux, resources, Tutorials

Varnish Cache for WordPress on cPanel

Varnish is an extremely easy-to-configure server cache that can help you counter the ‘slashdot effect’, that is, high traffic over a short period of time. Varnish does this by sitting between the client and the webserver and serving cached results to the client so that the server doesn’t have to process every page. It’s better than memcache and the like because the request never gets to the webserver, so you avoid one of the bottlenecks. In this tutorial, we’ll cover how to set up Varnish on a VPS (or dedicated server) where you have root access and are running your site using cPanel/WHM. It also applies to situations where you don’t have cPanel/WHM; you can just skip the cPanel portion if that’s the case. So, let’s get started.
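The idea of a cache sitting in front of the webserver can be sketched in a few lines of Python. This is only a toy model of the concept, not how Varnish works internally or how it is configured (Varnish uses its own configuration language, VCL); the `CachingFrontend` class, the `render_page` backend and the TTL value are all hypothetical.

```python
import time


class CachingFrontend:
    """Toy stand-in for a cache placed between clients and a backend."""

    def __init__(self, backend, ttl_seconds=120.0, clock=time.monotonic):
        self.backend = backend      # callable that does the expensive work
        self.ttl = ttl_seconds      # how long a cached response stays fresh
        self.clock = clock          # injectable clock, handy for testing
        self._store = {}            # url -> (expiry_time, response)
        self.backend_hits = 0       # how often the backend was really called

    def get(self, url):
        now = self.clock()
        entry = self._store.get(url)
        if entry is not None and entry[0] > now:
            return entry[1]         # fresh cached copy: backend is never touched
        self.backend_hits += 1
        response = self.backend(url)
        self._store[url] = (now + self.ttl, response)
        return response


def render_page(url):
    # Hypothetical backend, standing in for a page rendered by the webserver.
    return "<html>rendered " + url + "</html>"


frontend = CachingFrontend(render_page, ttl_seconds=60.0)
for _ in range(1000):
    frontend.get("/popular-post/")
```

After the loop, `frontend.backend_hits` is 1: a thousand requests for the same page reached the backend only once, which is the effect you want during a traffic spike.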

Continue reading “Varnish Cache for WordPress on cPanel”