Computer Science and Technology: September 2012

Tuesday, 25 September 2012

Git on Amazon EC2

So here's how I'm going to set up deployments on my EC2 instance: I'll git push my committed changes from my local machine to my remote repository (I'm hosting mine on Bitbucket; GitHub is great too) and then do a git pull on my EC2 instance. I should then be able to deploy a new version of my application directly. You could write a shell script to automate the deployment.

Here's how to configure git on your instance:

sudo yum install git
cat ~/.ssh/id_rsa.pub | ssh -i amazon-generated-key.pem ec2-user@amazon-instance-public-dns "cat >> .ssh/authorized_keys" (Run this one from your local machine. It appends your local public ssh key to the instance's authorized_keys file, which makes sure that the message "Permission denied (publickey)" doesn't show up.)
git clone https://foobar@bitbucket.org/me/foo.git (You'll be prompted for your password. If you get the message "fatal: Authentication failed", check whether your password contains special characters; if it does, you'll need to change it.)
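
The push-then-pull deployment described at the top of this post could be scripted roughly like this. It's only a sketch: the function names, the master branch and the foo clone directory are my assumptions, and the key file and host are the same placeholders used in the commands above.

```shell
#!/bin/sh
# deploy.sh: a sketch meant to be sourced on the local machine.
# Assumptions (placeholders, not real values): key file, host, branch, clone directory.
KEY_FILE="${KEY_FILE:-amazon-generated-key.pem}"
REMOTE_HOST="${REMOTE_HOST:-ec2-user@amazon-instance-public-dns}"
BRANCH="${BRANCH:-master}"
APP_DIR="${APP_DIR:-foo}"

# Print the command that will run on the instance: a fast-forward-only pull
# inside the clone, so the instance never creates merge commits of its own.
remote_pull_command() {
  printf 'cd %s && git pull --ff-only origin %s' "$1" "$2"
}

# Push local commits, then pull them on the instance over ssh.
# -t allocates a terminal so git can still prompt for the Bitbucket password.
deploy() {
  git push origin "$BRANCH" || return 1
  ssh -t -i "$KEY_FILE" "$REMOTE_HOST" "$(remote_pull_command "$APP_DIR" "$BRANCH")"
}
```

After sourcing the file (. ./deploy.sh), running deploy from inside your local repository does the push and the remote pull in one go. Restarting the application afterwards is left out, since that depends on how you run it.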

PostgreSQL 9.1 on Amazon EC2

Amazon's Elastic Block Store (EBS) may not be the perfect place to host your database: performance and reliability are common gripes (Reddit once went down for a couple of hours due to this). Still, it's pretty easy to work with and fairly flexible, and you can save database snapshots. In any case, I think it would be an interesting experiment to set up a Postgres DB on EBS and potentially check how it scales.

EBS falls under the free tier, as long as your EBS storage is under 30 GB, even with multiple EBS "volumes" [source]. Here are the steps I took in setting up Postgres 9.1 on my EC2 instance using EBS as the datastore:

Go to your Amazon online console and click the "Volumes" link located under "Elastic Block Store". The volume which is already present is the root volume.
Create a new volume. I set the size to 10 GiB. Check that the availability zone is the same as the zone of your root volume.
Select your new volume and click "Attach Volume" to attach it to your EC2 instance. My device was set to /dev/sdf.

Mount your device (i.e., your EBS volume) to a folder:

sudo su - (switches user to root)
yes | mkfs -t ext3 /dev/sdf (makes an ext3 filesystem on your device)
mkdir /pgdata (make some directory)
mount /dev/sdf /pgdata (mount your device to this directory)
exit (exit from root)
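
A mount done by hand like this won't come back after a reboot. One way to handle that, which I'm adding here as an assumption rather than something from my original setup, is an /etc/fstab entry. The helper name below is hypothetical; it only prints the line so that you can review it first.

```shell
# Print an /etc/fstab entry for the data volume.
# Arguments: device, mount point, filesystem type.
# "nofail" lets the instance finish booting even if the volume is not attached.
fstab_entry() {
  printf '%s %s %s defaults,nofail 0 2\n' "$1" "$2" "$3"
}
```

For the volume above, that would be fstab_entry /dev/sdf /pgdata ext3 | sudo tee -a /etc/fstab, followed by sudo mount -a to check that the entry parses.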

Make yum config changes so that we're able to create our own yum repo to circumvent dependency issues:

sudo vim /etc/yum.repos.d/amzn-main.repo (At the bottom of the "[amzn-main]" section, after "enabled=1", add "exclude=postgresql*". This tells yum that we don't want to use an amazon repository for packages that meet the postgresql criteria.)
sudo vim /etc/yum.repos.d/amzn-updates.repo (Add the same exclude to the bottom of the "[amzn-updates]" section)
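
If you'd rather not edit the two repo files by hand, the same change can be sketched as a small filter. The function name is made up, and I'm assuming the section contains a line that reads exactly enabled=1, as described above.

```shell
# Read a .repo file on stdin and print it with "exclude=postgresql*" added
# right after the "enabled=1" line of the named section (e.g. amzn-main).
# Not idempotent: running it twice adds the line twice.
add_postgres_exclude() {
  awk -v section="[$1]" '
    /^\[/ { in_section = ($0 == section) }   # track which section we are in
    { print }
    in_section && /^enabled=1$/ { print "exclude=postgresql*" }
  '
}
```

For example, add_postgres_exclude amzn-main < /etc/yum.repos.d/amzn-main.repo > /tmp/amzn-main.repo writes the modified copy to /tmp, which you can diff against the original before copying it back with sudo.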

Download the repository/key installation rpm from pgrpms.org:

wget http://yum.pgrpms.org/reporpms/9.1/pgdg-redhat91-9.1-5.noarch.rpm
sudo rpm -ivh pgdg-redhat91-9.1-5.noarch.rpm

Since the rpm is generated for Red Hat Enterprise Linux 5, we need to make a minor change:

sudo vim /etc/yum.repos.d/pgdg-91-redhat.repo
Make the following update:

## ORIGINAL
# baseurl=http://yum.postgresql.org/9.1/redhat/rhel-$releasever-$basearch
## UPDATED
baseurl=http://yum.postgresql.org/9.1/redhat/rhel-5-$basearch
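
The same edit can be expressed as a one-line sed filter, shown here as a sketch with a hypothetical function name. It swaps yum's $releasever variable in the baseurl for the literal 5 and leaves everything else alone.

```shell
# Read the pgdg repo file on stdin and pin the release in the baseurl to RHEL 5.
# The backslash makes sed treat "$" as a literal character, not end-of-line.
pin_rhel5() {
  sed 's/rhel-\$releasever-/rhel-5-/'
}
```

Something like pin_rhel5 < /etc/yum.repos.d/pgdg-91-redhat.repo > /tmp/pgdg-91-redhat.repo gives you a copy to inspect before replacing the original with sudo.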

Install PostgreSQL 9.1:

sudo yum install postgresql91-server
sudo rm -rf /pgdata/lost+found (postgres' initdb will fail to initialize a database cluster in /pgdata when there are files or directories present)
sudo chown -R postgres:postgres /pgdata (changes ownership of /pgdata to the postgres user)
sudo su -
su - postgres

Configure and launch the server:

/usr/pgsql-9.1/bin/initdb -D /pgdata
vim /pgdata/postgresql.conf
Update the line #listen_addresses = 'localhost' to listen_addresses = '*'
Update the line #port = 5432 to port = 5432
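
For repeatable setups, the two postgresql.conf changes above can be written as a filter instead of a vim session. This is a sketch with a hypothetical function name; it assumes the lines still look like the defaults quoted above.

```shell
# Read postgresql.conf on stdin; uncomment listen_addresses (listening on all
# interfaces) and the port setting. Other lines pass through untouched.
enable_remote_listen() {
  sed -e "s/^#listen_addresses = 'localhost'/listen_addresses = '*'/" \
      -e 's/^#port = 5432/port = 5432/'
}
```

Run as the postgres user, enable_remote_listen < /pgdata/postgresql.conf > /tmp/postgresql.conf produces a modified copy that you can diff and then move into place.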

Update pg_hba.conf (for client authentication). Navigate to the bottom of the file (vim /pgdata/pg_hba.conf) and change to:

# TYPE  DATABASE        USER            CIDR-ADDRESS            METHOD

# "local" is for Unix domain socket connections only
local   all             postgres                                trust
# IPv4 local connections:
host    all             power_user      0.0.0.0/0               md5  
# IPv6 local connections:
host    all             all             ::1/128                 md5

Start the server and create the database for "power_user":

/usr/pgsql-9.1/bin/pg_ctl start -D /pgdata
/usr/pgsql-9.1/bin/createuser power_user (press "y" when prompted)
/usr/pgsql-9.1/bin/psql -p 5432
postgres=# ALTER USER power_user WITH PASSWORD 'pwd';
postgres=# CREATE DATABASE foo WITH OWNER power_user;
postgres=# \q
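
The interactive psql session can also be fed from a script. Below is a sketch that just prints the same two statements; the function name is my own, and it does no escaping, so only pass it values you trust.

```shell
# Print the SQL that sets the user's password and creates a database owned by
# that user. Arguments: user, password, database.
bootstrap_sql() {
  printf "ALTER USER %s WITH PASSWORD '%s';\n" "$1" "$2"
  printf 'CREATE DATABASE %s WITH OWNER %s;\n' "$3" "$1"
}
```

Piping it into psql, as in bootstrap_sql power_user pwd foo | /usr/pgsql-9.1/bin/psql -p 5432, runs both statements without opening a prompt.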

In your Amazon console, update your security group and add a Custom TCP rule for port 5432 so that you can access your database remotely.

The following are your database properties:

URL is jdbc:postgresql://yourwebsite.com:5432/foo (note that you can replace yourwebsite.com with the Elastic IP of your instance)
Username is power_user
Password is pwd
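
To check these properties from another machine before wiring them into an application, something like the sketch below should do. Both function names are hypothetical, and check_remote_db assumes the psql client is installed wherever you run it.

```shell
# Build the JDBC URL from its parts. Arguments: host, port, database.
jdbc_url() {
  printf 'jdbc:postgresql://%s:%s/%s\n' "$1" "$2" "$3"
}

# Try a trivial query over the network; psql prompts for the password.
# Arguments: host, port, database, user.
check_remote_db() {
  psql -h "$1" -p "$2" -U "$4" -c 'SELECT version();' "$3"
}
```

With the values from this post, that is check_remote_db yourwebsite.com 5432 foo power_user. If it hangs, the security group rule for port 5432 is the first thing I'd look at.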

You can use pgAdmin to connect to and manage your database remotely if you want. Other tools like DbVisualizer also provide a GUI for viewing the schema and data stored in your database (DbVisualizer even allows you to view blob data).

(This post is based on Postgres 9.0 in EC2.)

Sunday, 23 September 2012

Coding in the Subway

Before joining my current company, I was determined to release my Android app, Bingle. Call it paranoia, but I didn't want any legal trouble, however remote. Just a day before starting work, I hastily tested and released my app, and headed off to visit a relative.

While on the subway, I realized that my app would throw an exception if the device wasn't connected to the internet. I immediately booted up my laptop and fixed the bug. After the train ride, I located the nearest Starbucks, connected my laptop to the wifi and released a new APK file.

Lesson learnt: don't test hastily. Also, I should've looked at my tickets on Bitbucket before releasing my app.

Friday, 21 September 2012

Software Maintenance

In university, I had this eccentric lecturer who'd repeat, almost manically, "Software Maintenance is the worst. Whatever you do, don't work for the maintenance department." Every lesson. EVERY LESSON!

One of my slacker friends, who barely attended class, was once caught daydreaming. "Slacker, I've never seen you in class before. Do you take this course?" Without missing a beat, this guy says "Maintenance department".

(I just fixed a rather infuriating bug today and was reminded of this story.)

Monday, 17 September 2012

Amazon EC2 Basics (Part 2)

The next step would be to associate an Elastic IP with your EC2 instance and set up an Apache server:

From your Amazon console, click "Elastic IPs" and allocate a new address.
Associate it with your EC2 instance (if you don't associate it with an instance, you'll be charged).
Wherever you registered your domain name (GoDaddy, Namecheap, etc.), you're going to have to change the Address Record (A record) to point to your IP, e.g., for Namecheap, click "Manage domains", then "All host records", and enter your EC2 IP address in both fields.

You should now be able to head over to your registered address (e.g., foo.com) on port 8080 and see the Tomcat manager.
You might also want to install Apache 2. This is optional: you can just let Tomcat work as a web server instead of only as a servlet container, but using Apache 2 as the server with Tomcat as the container allows you to fiddle around with modules. In this example, we're going to be using mod_proxy to proxy requests to Tomcat. Apache listens on port 80 and Tomcat on port 8080.
Installing and starting Apache is simple:

ssh into your instance
type sudo yum -y install httpd
type sudo /sbin/chkconfig httpd on
type sudo /sbin/service httpd start
/var/www/html/ is the root web directory; httpd.conf is located at /etc/httpd/conf; mod_proxy is located at /etc/httpd/modules
note that after changing httpd.conf, you should restart apache using sudo /sbin/service httpd restart

Before doing anything else, you need to modify your Amazon security group to open up port 80:

on your Amazon console, click Security Groups
check the box of your security group, click "Inbound", create a custom TCP rule for port 80 (port range would be 80) and add this rule

Also, you might want to modify port 8080 to only accept requests from your instance for added security:

on your Amazon console, click Security Groups
check the box of your security group, click "Inbound", create a custom TCP rule for port 8080 (port range would be 8080) and under "Source", add your elastic IP e.g., some.ip.foo.bar/32
if you were following Part 1 of this series, you should remove the old rule for port 8080, where the "Source" was 0.0.0.0/0
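
I made these changes by clicking through the console, but the same three rule changes can be sketched with the AWS command line tool. Treat this as an assumption-laden example: it presumes the aws CLI is installed and configured, and the function names, group name and IP address are placeholders. The helper only prints the commands so that you can read them before running anything.

```shell
# Print one aws CLI command. Arguments: authorize|revoke, group name, port, CIDR.
ingress_rule() {
  printf 'aws ec2 %s-security-group-ingress --group-name %s --protocol tcp --port %s --cidr %s\n' \
    "$1" "$2" "$3" "$4"
}

# Print the three changes described above. Arguments: group name, elastic IP.
plan_rules() {
  ingress_rule authorize "$1" 80 0.0.0.0/0      # open port 80 to everyone
  ingress_rule authorize "$1" 8080 "$2/32"      # port 8080 only from the instance itself
  ingress_rule revoke "$1" 8080 0.0.0.0/0       # drop the old wide-open rule from Part 1
}
```

Running plan_rules my-group 203.0.113.10 shows the commands; once they look right, piping the output into sh executes them.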

Next, modify your httpd.conf file for any context path that you wish to forward to Tomcat. In this example, I'm forwarding / to http://foo.com:8080/myapp/ (the target ends with a slash because the path does; mod_proxy expects the two to match). Note that by default, mod_proxy is already configured properly in this file (otherwise, follow this link, modifying as appropriate):

ProxyPass         /  http://foo.com:8080/myapp/
ProxyPassReverse  /  http://foo.com:8080/myapp/

Also modify the Tomcat server.xml file:

vim /etc/tomcat7/server.xml
Make sure that you comment out existing lines which deal with port 8080 and instead add the following line:

<Connector port="8080" proxyName="www.foo.com" proxyPort="80"/>

Restart your Tomcat and Apache, and you should be all set. Just type foo.com and you should be able to view your website.

Saturday, 8 September 2012

Amazon EC2 Basics (Part 1)

(Before we begin, note that the instructions for installing Tomcat 7 are for the version that is included with Amazon's repository. For instructions on installing the official and full Apache Tomcat version, please read Part 1.2 instead.)

Here is a short description of how to get started with an EC2 project, for absolute beginners. We'll also be exploring the following along the way: Tomcat 7, Apache 2 and mod_proxy.

Go to your Amazon EC2 console (online) and start up a new instance.
Follow the steps from there; it's pretty straightforward. The main step you need to take care of is setting up the security group.

You'll want to create a rule for ICMP (from the drop down) and select "All". This enables ping.
You'll want to create a rule for TCP for port 22 so that you can ssh in
You'll want to create a rule for TCP for port 8080 for http

When you're done, you can start the instance from the online console itself.
Next, right click the instance and click "Connect". You'll be able to see the ssh command which you can execute via your terminal.
Run a bunch of commands:

sudo yum update (to get the latest patches)
sudo yum install java-1.6.0-openjdk-devel (to get the JDK; you only have the JRE right now; if you get the full JDK, you can debug in tomcat)
sudo yum install tomcat7 (to install tomcat 7; located in /var/lib; conf is located in /etc/tomcat7)
sudo service tomcat7 start (to start tomcat)
sudo yum install tomcat7-webapps tomcat7-docs-webapp tomcat7-admin-webapps (to get the webapps, which aren't downloaded with the tomcat package)

Now, connect to your instance on port 8080 e.g., http://foo-instance.compute-1.amazonaws.com:8080/

From here, you'd probably want to navigate to the manager app, but probably won't be able to because access to the manager app is restricted by default.

vim /etc/tomcat7/tomcat-users.xml (vim is my text editor of choice, you can choose any text editor, really; if you are denied permission to change the tomcat-users.xml file, you'll need to change permissions for this file using the chmod command)
insert the following into the tomcat-users.xml file: <role rolename="manager-gui" />
also insert: <user username="tomcat" password="pwd" roles="manager-gui" />
now you need to restart tomcat (sudo service tomcat7 restart)
connect to your instance on port 8080 again, click on manager and input the username and password
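
To make it clear where the two elements go, here's a sketch that prints a complete minimal tomcat-users.xml. The role and user lines must sit inside the tomcat-users root element. The function name is hypothetical, and the real file on your instance will likely also contain commented-out examples that this sketch ignores.

```shell
# Print a minimal tomcat-users.xml granting one user access to the manager GUI.
# Arguments: username, password.
tomcat_users_xml() {
  cat <<EOF
<?xml version='1.0' encoding='utf-8'?>
<tomcat-users>
  <role rolename="manager-gui"/>
  <user username="$1" password="$2" roles="manager-gui"/>
</tomcat-users>
EOF
}
```

For example, tomcat_users_xml tomcat pwd | sudo tee /etc/tomcat7/tomcat-users.xml overwrites the file with this minimal version, so back up the original first.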

You're all set! Just head on over to the tomcat manager and deploy your war files!

(This post is based on this article.)

Thursday, 6 September 2012

The research process

My paper got published recently and I'm pretty stoked about it, especially since undergrads rarely get to publish, let alone with first authorship. I'd like to take a moment to share my thoughts on the research process that culminated in this achievement.

Initially, I had decided on a project and supervisor relatively late in the semester. It was related to the Prisoner's Dilemma, and I was tasked with running an arbitrary bunch of experiments to simulate agent interactions. Those first few weeks, I really thought hard. I read voraciously. The whole time, I was really under pressure to come up with something innovative and interesting. Soon, I had a rough idea. I was pretty excited about modelling noise in these agent interactions, and some realistic form of noise, something which wasn't boring. Interactions between people are affected by things like language, culture, etc. (i.e., noise), and it would've been really cool to model these types of interactions. Now I needed to formalize this idea.

This was a painful process and I spent a bunch of sleepless nights crying myself to sleep (cue laughter). One fateful dinner, my group of friends, none of them CS majors, were discussing their projects. Hearing mine, one of my friends went "Hey, that cultural thingy you were blabbering about, that kinda sounds like something we learnt in business class. I can't remember, but it has a name." I then spent the next half an hour hounding her (some might call that harassment). Finally, I had the term I was looking for: psychical distance.

The next step was far more rewarding. I thought of modelling sound waves as communication signals between agents in this Prisoner's Dilemma-like game. These sound waves would be affected by the noisy effects of psychical distance. This was difficult, since I had never taken a course in sound engineering. So I found a professor who knew things: sound things! Anyway, turns out, he couldn't help me. Another few sleepless nights followed. Finally, I gave in and visited the library. I remember hauling 10-15 sound engineering tomes to the nearby table, poring over those books in search of the elusive equation (which would help me design a function to model psychical noise). You should've seen me: disheveled hair, unshaved beard, baggy eyed. I was surprised no one mistook me for a hobo. Incredibly, I found a bunch of equations which fit perfectly into my model! That was a great day.

Fast forward to presentation day: I had a really cool project which I had poured my heart and soul into. Too bad nobody really understood much of it.

Anyway, I wasn't done yet. I soon got a job, but I was quite relentless in designing new experiments in my spare time. My supervisor suggested that we submit a paper. The publishing itself took nearly 8 months after that, but boy was I happy.

The whole research process was quite rewarding, but I'd like to highlight a few things I learnt:
1. Choose a good supervisor
2. Have varied interests (my project is essentially an amalgamation of comp sci, math, and business)
3. Talk to your friends
4. Textbooks in the library aren't just for show
5. Luck does play a part sometimes

Now I've not mentioned some of the more obvious and hackneyed advice that goes into research (work hard, don't give up, etc.), but, uh, yeah, there's light at the end of the tunnel, friends.

Wednesday, 5 September 2012

Microsoft Lumia 920 looks cool

Microsoft and Nokia recently unveiled the Lumia 920, which at face value seems like a pretty good device. It boasts a dual core 1.5 GHz CPU, an awesome camera, a nifty wireless charger (it's pretty cool, but seems to add little value in my opinion), an apparently sharp display and, to top it off, I personally like how the device looks. It's scheduled to be released later this year though, presumably to avoid competing with the iPhone 5. It is a step in the right direction, but the main problem with Windows devices is apps.

Until Microsoft beef up their appstore, I personally don't see why people would go for a Microsoft device. Actually, the way I see it, it's not about the number of apps, but rather the number of good, free apps. I really hope Microsoft are cognizant of this. The day they "fix" their appstore (and their dev framework) will be the day I'll start taking them as serious players in the smartphone market.

Saturday, 1 September 2012

A little bit about Git

It's going to be difficult for me to go back to SVN. Git has made version control a cinch; I actually look forward to creating, merging and switching branches. SVN users may recall the tedious experience of working with branches.

The magic is in the "distributed" approach that Git takes. What this means is that you can actually get the entire history of your remote repo and store it locally (with SVN, you'd only have the latest revision, and not the whole history). Changes can be made and committed locally, without needing to go online. It is only when you push local commits to the remote repo (so that others can fetch and merge your changes) that you need internet connectivity.

I actually remember that in my previous company, where we used SVN, sometimes, the server would go down and you'd be stuck, unable to commit and collaborate until the problem was fixed. With Git, this could've been circumvented if somebody had just pushed his copy to another server, since this copy would essentially contain the whole history of the project.

One of my favourite Git features is to stash changes. Essentially, users can choose to store their current working changes away and switch to another task immediately. They can later unstash and continue working on the older task.
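
Here's what stashing looks like in practice, as a small walkthrough in a throwaway repository. The function name, the file name and the demo identity are all made up for illustration, and it assumes git and mktemp are available.

```shell
# Create a temporary repo, stash an uncommitted change, then bring it back.
stash_demo() {
  repo=$(mktemp -d) || return 1
  (
    cd "$repo" || exit 1
    git init -q .
    git config user.email "demo@example.com"   # local identity for the demo only
    git config user.name "Demo"
    echo "v1" > notes.txt
    git add notes.txt
    git commit -q -m "first commit"
    echo "half-finished idea" >> notes.txt     # uncommitted work in progress
    git stash -q                               # shelve it; the working tree is clean again
    cat notes.txt                              # prints: v1
    git stash pop -q                           # bring the work back
    cat notes.txt                              # prints: v1, then half-finished idea
  )
  rm -rf "$repo"
}
```

Between the stash and the pop is where you'd switch branches and deal with the other task.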

Also, specific commits can be cherry-picked and applied to a different branch, and partial file commits are possible. The possibilities are endless.

The only downside is that the learning curve is slightly steep initially. This shouldn't deter developers though, as the whole experience of switching from SVN to Git is immensely rewarding.