[Python] Read data into array

This article is a follow-up to my previous Perl script for manipulating data in a file. After I posted my question on a social website, a lot of people encouraged me to do it in Python.

Some commonly cited advantages of Python over Perl are that it is easier to maintain, easier for other people to read, and even Google programmers use it. There is also a claim out there that if you know C++, you can easily pick up Python. (Alright, I have to admit that the main reason is that Googlers use it ..)

So I started learning Python and tried to write a Python script to do the same task:

  1. read in data from a file
  2. and store elements in arrays
  3. do simple calculations
  4. output result

This Python script also takes the input file name as a command-line argument. First we need to import two modules:

   import os
   import sys
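The os module allows us to work with files and directories. The sys module enables us to take command-line arguments. (Strictly speaking, the snippets in this post only need sys, because open is a built-in function.)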

   filename = sys.argv[1]
As in C++, the first argument (argv[0]) is the name of your script, so the input file name is in argv[1]. Here we assign that file name to the variable filename.
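If the script is started without an argument, sys.argv[1] raises an IndexError. Here is a minimal sketch of a guard against that. The helper name get_filename is my own invention for illustration, not something from the sys module:

```python
import sys

def get_filename(argv):
    """Return the input file name from an argv-style list, or None if missing."""
    # argv[0] is the script name, so the first real argument is argv[1].
    if len(argv) < 2:
        return None
    return argv[1]

# Demonstration with hand-written argument lists instead of the real sys.argv.
print(get_filename(["script.py", "data.txt"]))  # data.txt
print(get_filename(["script.py"]))              # None

# In the actual script you would pass the real argument list.
filename = get_filename(sys.argv)
```

When the result is None, the script can print a usage message and stop instead of crashing.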

   lines = [line.strip() for line in open(filename).readlines()]
This statement reads the lines from the file and strips off the whitespace before and after each line. I personally found this command pretty intuitive. One thing I need to get used to is the use of for ... in. This is how a for loop is written in Python, which is very different from C++. Now each line of the file is stored as one element of lines. To access each element in it:
   for i in lines:
       row = i.split()
       id.append(int(row[0]))
       y.append(float(row[1]))
The split() function splits each line into its individual items (row[0] and row[1] in this case). The split() function can take a delimiter, e.g. split('\t'). However, I found it works best to just leave it blank, because then it splits on any run of whitespace, whether spaces or tabs. The id and y arrays are used to store the values in each column. We need to create them before the loop that uses them:
  id = []
  y  = []
The [] denotes an empty list, which is Python's built-in array-like type.
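To make the reading and splitting steps more concrete, here is a small sketch that works on made-up lines held in memory rather than a real file. It shows the explicit loop that the one-line form stands for, and how split() behaves with and without a delimiter. I use the name ids instead of id here, only to avoid hiding Python's built-in id function:

```python
# Raw lines as they might come from a file (made-up sample data).
raw_lines = ["  1\t0.5  \n", "2   1.25\n"]

# The one-line form used above ...
lines = [line.strip() for line in raw_lines]

# ... is shorthand for this explicit loop.
lines_loop = []
for line in raw_lines:
    lines_loop.append(line.strip())
print(lines == lines_loop)   # True

# With no argument, split() breaks on any run of whitespace (spaces or tabs).
print(lines[0].split())      # ['1', '0.5']
print(lines[1].split())      # ['2', '1.25']

# With an explicit delimiter, only that exact character separates items.
print(lines[0].split('\t'))  # ['1', '0.5']
print(lines[1].split('\t'))  # ['2   1.25']  (no tab, so a single item)

# Fill the two lists, which are created before the loop.
ids = []
y = []
for line in lines:
    row = line.split()
    ids.append(int(row[0]))
    y.append(float(row[1]))
print(ids, y)                # [1, 2] [0.5, 1.25]
```

The second line uses spaces instead of a tab, which is exactly the case where split('\t') fails and the blank form still works.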

Once we have the values stored in arrays, we can easily manipulate them. To output the results, we could use the open function:

  output = open("output.txt",'w')
  for i in range(len(y)):
        output.write("%2d\t %12f\n" %(id[i],y[i]))
  output.close()
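The % sign fills the format string with our values, so we get the output format we want: %2d writes an integer at least 2 characters wide, and %12f writes a float at least 12 characters wide.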
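Here is a small sketch of what those format specifiers produce. The values 7 and 3.14159 are arbitrary examples; repr() is used so the padding spaces are visible:

```python
# %2d  -> integer, right-aligned in a field at least 2 characters wide
# %12f -> float with 6 decimals (the default), right-aligned in 12 characters
id_text = "%2d" % 7
y_text = "%12f" % 3.14159
print(repr(id_text))  # ' 7'
print(repr(y_text))   # '    3.141590'

# Several values are passed as a tuple, in the same order as the specifiers.
row = "%2d\t %12f\n" % (7, 3.14159)
print(row == id_text + "\t " + y_text + "\n")  # True

# A precision can be added to control the number of decimals.
print(repr("%8.3f" % 3.14159))  # '   3.142'
```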

I may have spent more time writing this Python script than I did on my Perl one, but it is easier to follow. And because of the indentation rules Python requires, the code looks neater.
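For reference, here is one way the four steps could fit together in a single function. This is a sketch under some assumptions of mine: the function name process is made up, the "simple calculation" is just a placeholder that doubles every value, and the sample data is invented. It also uses with, which closes the files automatically:

```python
import os
import tempfile

def process(in_path, out_path):
    """Read two-column data, apply a placeholder calculation, write the result."""
    ids = []
    y = []
    # Steps 1 and 2: read the file and store each column in a list.
    with open(in_path) as f:
        for line in f:
            row = line.split()
            if len(row) < 2:
                continue  # skip blank or incomplete lines
            ids.append(int(row[0]))
            y.append(float(row[1]))

    # Step 3: placeholder calculation (an assumption): double every value.
    y = [2.0 * value for value in y]

    # Step 4: write the formatted result; "with" closes the file for us.
    with open(out_path, "w") as output:
        for i in range(len(y)):
            output.write("%2d\t %12f\n" % (ids[i], y[i]))
    return ids, y

# Try it on a small made-up data set in a temporary directory.
workdir = tempfile.mkdtemp()
in_path = os.path.join(workdir, "input.txt")
out_path = os.path.join(workdir, "output.txt")
with open(in_path, "w") as f:
    f.write("1 0.5\n2 1.25\n\n3 2.0\n")

ids, y = process(in_path, out_path)
print(ids, y)  # [1, 2, 3] [1.0, 2.5, 4.0]
```

In the real script the call would be something like process(sys.argv[1], "output.txt").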

Hooray! My first Python code!!