Sometimes, when reading a CSV file, you may only want to read specific rows. For example, you might want to skip every other row or read only a particular row of information. In this blog post, we will explain how to read specific rows from a CSV file using Python, providing concrete examples and explanations.
Reading Specific Rows
Reading specific rows is easy once you know how to read a file line by line: you process the rows you want and simply ignore the rest.
In the following sections, we will introduce two methods: (1) reading every other row and (2) reading a specific row.
Reading Every Other Row
Implementing a program to read every other row is straightforward by using conditional branching with an if statement.
Preparing the File
First, prepare a CSV file like the one below, save it as AvrgTmp_Kyoto2018.csv (the name the sample code opens), and place it in Desktop/LabCode/python/data-analysis/Input_File_Specrow.
# averaged temperature in 2018 @ Kyoto city
# 01: month 02: averaged temperature in the daytime
1,3.9
2,4.4
3,10.9
4,16.4
5,20.0
6,23.4
7,29.8
8,29.5
9,23.6
10,18.7
11,13.5
12,8.2
Implementing the Program (Sample Code)
The code will look like this:
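import re
input_data = open('AvrgTmp_Kyoto2018.csv', 'r')
num = 0  # index of the current data row (comment lines are not counted)
for row in input_data:
    if not re.match('#', row):  # skip comment lines that start with '#'
        if num % 2 == 0:  # keep only even-numbered data rows
            split_row = row.rstrip('\n').split(',')
            month = split_row[0]
            ave_temperature = split_row[1]
            print(month, ave_temperature)
        num += 1
input_data.close()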
Output
After navigating to the directory with cd Desktop/LabCode/python/data-analysis/Input_File_Specrow in the terminal, running the program produces the following output:
python input_file.py
# (Output)
# 1 3.9
# 3 10.9
# 5 20.0
# 7 29.8
# 9 23.6
# 11 13.5
Code Explanation
num = 0
for row in input_data:
    if not re.match('#', row):
        ...
        num += 1
Before reading the file, we define an integer variable num with an initial value of 0. After each data row (a line that is not a comment) is processed, we increment num by 1 (num += 1 is equivalent to num = num + 1). Because comment lines do not advance the counter, num corresponds to the data row number: 0 corresponds to the first data row, 1 corresponds to the second data row, and so on.
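The correspondence can be checked with a small sketch. It assumes, as the sample output implies, that num is incremented only for non-comment lines; the list sample_lines is a hypothetical stand-in for the first few lines of the file, and startswith is used here in place of re.match for brevity.

```python
# Hypothetical stand-in for the first lines of AvrgTmp_Kyoto2018.csv.
sample_lines = [
    "# averaged temperature in 2018 @ Kyoto city\n",
    "# 01: month 02: averaged temperature in the daytime\n",
    "1,3.9\n",
    "2,4.4\n",
    "3,10.9\n",
]

num = 0
numbered_rows = []  # (num, row text) pairs collected for inspection
for row in sample_lines:
    if not row.startswith("#"):  # comment lines do not advance num
        numbered_rows.append((num, row.rstrip("\n")))
        num += 1

for index, text in numbered_rows:
    print(index, "->", text)
# 0 -> 1,3.9
# 1 -> 2,4.4
# 2 -> 3,10.9
```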
Once we have established the correspondence between num and the rows, we can use the remainder operator (%) to perform conditional branching and read the data every other row.
if num % 2 == 0:
By writing if num % 2 == 0, the subsequent operations are executed only when num is even. The same idea reads every third row with if num % 3 == 0. By setting the initial value of num to something other than 0, you can skip the first row or shift which rows are selected. Feel free to experiment.
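The variations mentioned above can be sketched as a single function. The name every_nth_row and its step and start parameters are hypothetical, introduced only for illustration; instead of changing the initial value of num, this sketch takes the index of the first row to keep as start. The data is built in memory so the example runs without the CSV file.

```python
# Monthly values copied from the sample CSV file in this post.
TEMPERATURES = [3.9, 4.4, 10.9, 16.4, 20.0, 23.4,
                29.8, 29.5, 23.6, 18.7, 13.5, 8.2]

# In-memory stand-in for the file: one comment line plus twelve data rows.
sample_lines = ["# averaged temperature in 2018 @ Kyoto city\n"] + [
    f"{month},{temp}\n" for month, temp in enumerate(TEMPERATURES, start=1)
]


def every_nth_row(lines, step=2, start=0):
    """Return every step-th data row, beginning at data row index start."""
    selected = []
    num = 0
    for row in lines:
        if not row.startswith("#"):  # comment lines are not counted
            if num >= start and (num - start) % step == 0:
                selected.append(row.rstrip("\n").split(","))
            num += 1
    return selected


every_third = every_nth_row(sample_lines, step=3)
skip_first = every_nth_row(sample_lines, step=2, start=1)

print([row[0] for row in every_third])  # ['1', '4', '7', '10']
print([row[0] for row in skip_first])   # ['2', '4', '6', '8', '10', '12']
```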
*Note: This article does not explain in detail how to read a file line by line or how to exclude comment lines.
Reading a Specific Row
Reading a specific row requires only a slight modification to the previous code. Since num corresponds to the data row numbers (0 corresponds to the first data row), you can retrieve only the fifth data row with if num == 4. Below is the sample code and output.
import re
input_data = open('AvrgTmp_Kyoto2018.csv', 'r')
num = 0  # index of the current data row (comment lines are not counted)
for row in input_data:
    if not re.match('#', row):  # skip comment lines that start with '#'
        if num == 4:  # keep only the fifth data row
            split_row = row.rstrip('\n').split(',')
            month = split_row[0]
            ave_temperature = split_row[1]
            print(month, ave_temperature)
        num += 1
input_data.close()
python input_file.py
# (Output)
# 5 20.0
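As an alternative to the counter, the same row can be picked with itertools.islice from the standard library, which stops reading once the requested row has been reached. This is a sketch of a different technique, not the code used above; the helper name nth_data_row and the in-memory sample_lines are assumptions made for illustration.

```python
from itertools import islice

# Hypothetical stand-in for the first lines of AvrgTmp_Kyoto2018.csv.
sample_lines = [
    "# averaged temperature in 2018 @ Kyoto city\n",
    "# 01: month 02: averaged temperature in the daytime\n",
    "1,3.9\n",
    "2,4.4\n",
    "3,10.9\n",
    "4,16.4\n",
    "5,20.0\n",
    "6,23.4\n",
]


def nth_data_row(lines, index):
    """Return the fields of data row number index (0-based), or None."""
    data_rows = (line for line in lines if not line.startswith("#"))
    # islice yields at most one item: the row at position index.
    row = next(islice(data_rows, index, index + 1), None)
    if row is None:
        return None  # the input has fewer data rows than requested
    return row.rstrip("\n").split(",")


fifth = nth_data_row(sample_lines, 4)
missing = nth_data_row(sample_lines, 99)
print(fifth)    # ['5', '20.0']
print(missing)  # None
```

Returning None for an out-of-range index is a design choice of this sketch; raising an exception would be equally reasonable.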
Summary
In this blog post, we explained how to read specific rows from a file using Python. To read specific rows, you need to define an integer variable (num) that counts the data rows. For example:
- To read every other row, use if num % 2 == 0 for conditional branching.
- To read a specific row, use if num == number and specify the desired row number.
By employing these techniques, you can read the data of the desired rows.
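Both conditions can also be passed into one reusable function. The sketch below is an assumption-laden variant rather than the code from this post: it uses a with block so the file is closed automatically, enumerate in place of the manual counter, and the standard csv module for splitting fields. The helper name select_rows, its keep parameter, and the temporary file sample.csv are hypothetical.

```python
import csv
import os
import tempfile

# A shortened copy of the sample data, written to a temporary file below.
CSV_TEXT = """# averaged temperature in 2018 @ Kyoto city
# 01: month 02: averaged temperature in the daytime
1,3.9
2,4.4
3,10.9
4,16.4
5,20.0
6,23.4
"""


def select_rows(path, keep):
    """Return the data rows whose 0-based index num satisfies keep(num)."""
    selected = []
    with open(path, newline="") as input_data:  # closed automatically
        data_lines = (line for line in input_data if not line.startswith("#"))
        for num, fields in enumerate(csv.reader(data_lines)):
            if keep(num):
                selected.append(fields)
    return selected


# Write the sample to a temporary directory so the sketch runs anywhere.
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "sample.csv")
    with open(path, "w", newline="") as f:
        f.write(CSV_TEXT)
    every_other = select_rows(path, lambda num: num % 2 == 0)
    fifth_only = select_rows(path, lambda num: num == 4)

print(every_other)  # [['1', '3.9'], ['3', '10.9'], ['5', '20.0']]
print(fifth_only)   # [['5', '20.0']]
```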