
Read and Write Line by Line From All Text Files to Webpage

Reading and Writing Text Files

Overview

Teaching: 60 min
Exercises: 30 min

Questions

  • How can I read in data that is stored in a file or write data out to a file?

Objectives

  • Be able to open a file and read in the data stored in that file

  • Understand the difference between the file name, the opened file object, and the data read in from the file

  • Be able to write output to a text file with simple formatting

Why do we want to read and write files?

Being able to open and read in files allows us to work with larger data sets, where it wouldn't be possible to type in each and every value and store them one at a time as variables. Writing files allows us to process our data and then save the output to a file so we can look at it later.

Right now, we will practice working with a comma-delimited text file (.csv) that contains several columns of data. However, what you learn in this lesson can be applied to any general text file. In the next lesson, you will learn another way to read and process .csv data.

Paths to files

In order to open a file, we need to tell Python exactly where the file is located, relative to where Python is currently working (the working directory). In Spyder, we can do this by setting our current working directory to the folder where the file is located. Or, when we provide the file name, we can give a complete path to the file.
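As a minimal sketch, the working directory can also be checked from Python itself, and a complete path can be built from it. The functions os.getcwd(), os.path.join(), and os.path.exists() are part of Python's standard library; the file name is the lesson's practice file, and the code does not assume the file is actually present.

```python
import os

# Ask Python which folder it is currently working in
working_directory = os.getcwd()
print(working_directory)

# Build a complete path to a file inside that folder.
# The file does not have to exist for the path to be built.
full_path = os.path.join(working_directory, 'Plates_output_simple.csv')
print(full_path)

# Check whether the file is really there before trying to open it
print(os.path.exists(full_path))
```

If the last line prints False, either the working directory is not the folder you expected or the file has not been copied there yet.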

Lesson Setup

We will work with the practice file Plates_output_simple.csv.

  1. Locate the file Plates_output_simple.csv in the directory home/Desktop/workshops/bash-git-python.
  2. Copy the file to your working directory, home/Desktop/workshops/YourName.
  3. Make sure that your working directory is also set to the folder home/Desktop/workshops/YourName.
  4. As you are working, make sure that you save your file opening script(s) to this directory.

The File Setup

Let's open and examine the structure of the file Plates_output_simple.csv. If you open the file in a text editor, you will see that the file contains several lines of text.

DataFileRaw

However, this is fairly difficult to read. If you open the file in a spreadsheet program such as LibreOffice Calc or Excel, you can see that the file is organized into columns, with each column separated by the commas in the image above (hence the file extension .csv, which stands for comma-separated values).

DataFileColumns

The file contains one header row, followed by eight rows of data. Each row represents a single plate image. If we look at the column headings, we can see that we have collected data for each plate:

  • The name of the image from which the information was collected
  • The plate number (there were 4 plates, with each plate imaged at 2 different time points)
  • The growth condition (either control or experimental)
  • The observation timepoint (either 24 or 48 hours)
  • Colony count for the plate
  • The average colony size for the plate
  • The percentage of the plate covered by bacterial colonies

We will read in this data file and then work to analyze the data.

Opening and reading files is a three-step process

We will open and read the file in three steps.

  1. We will create a variable to hold the name of the file that we want to open.
  2. We will call a function to open the file.
  3. We will call a function to actually read the data in the file and store it in a variable so that we can process it.

And then, there's one more step to do!

  • When we are done, we should remember to close the file!

You can think of these three steps as being similar to checking out a book from the library. First, you have to go to the catalog or database to find out which book you need (the filename). Then, you have to go and get it off the shelf and open the book up (the open function). Finally, to gain any information from the book, you have to read the words (the read function)!

Here is an example of opening, reading, and closing a file.

    # Create a variable for the file name
    filename = 'Plates_output_simple.csv'  # This is simply a string of text

    # Open the file
    # 'r' says we are opening the file to read; infile is the opened file object that we will read from
    infile = open(filename, 'r')

    # Store the data from the file in a variable
    data = infile.read()

    # Print the data in the file
    print(data)

    # Close the file
    infile.close()

Once we have read the data in the file into our variable data, we can treat it like any other variable in our code.
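Python also offers the with statement, which closes the file automatically when the indented block ends. This is an alternative to the open/read/close pattern used in this lesson, not a replacement for understanding it. The sketch below first writes a small stand-in file so that it runs on its own; the file name sample_plates.csv and its contents are made up for illustration.

```python
# Make a small stand-in file so this example runs on its own.
# The file name and its contents are made up for illustration.
filename = 'sample_plates.csv'
outfile = open(filename, 'w')
outfile.write('name,plate,condition\n')
outfile.write('colonies01.tif,1,control\n')
outfile.close()

# 'with' opens the file and closes it automatically at the end of the block
with open(filename, 'r') as infile:
    data = infile.read()

print(data)
print(infile.closed)  # True: the file was closed for us
```

The three names play the same roles as before: filename is the text string, infile is the opened file object, and data holds what was read.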

Use consistent names to make your code clearer

It is a good idea to develop some consistent habits about the way you open and read files. Using the same (or similar!) variable names each time will make it easier for you to keep track of which variable is the name of the file, which variable is the opened file object, and which variable contains the read-in data.

In these examples, we will use filename for the text string containing the file name, infile for the open file object from which we can read in data, and data for the variable holding the contents of the file.

Commands for reading in files

There are a variety of commands that allow us to read in data from files.
infile.read() will read in the entire file as a single string of text.
infile.readline() will read in one line at a time (each time you call this command, it reads in the next line).
infile.readlines() will read all of the lines into a list, where each line of the file is an item in the list.
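The difference between the three commands is easiest to see side by side. The sketch below writes a small stand-in file (the name sample_read.txt and its three lines are made up) and reads it once with each command, reopening the file each time so that every read starts from the top.

```python
# Stand-in file with three made-up lines, so the example runs on its own
filename = 'sample_read.txt'
outfile = open(filename, 'w')
outfile.write('first\nsecond\nthird\n')
outfile.close()

infile = open(filename, 'r')
whole_file = infile.read()        # one string holding the entire file
infile.close()

infile = open(filename, 'r')
one_line = infile.readline()      # one string holding only the first line
infile.close()

infile = open(filename, 'r')
all_lines = infile.readlines()    # a list with one string per line
infile.close()

# repr() shows the newline characters that print() would hide
print(repr(whole_file))
print(repr(one_line))
print(all_lines)
```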

Mixing these commands can have some unexpected results.

    # Create a variable for the file name
    filename = 'Plates_output_simple.csv'

    # Open the file
    infile = open(filename, 'r')

    # Print the first two lines of the file
    print(infile.readline())
    print(infile.readline())

    # Call infile.read()
    print(infile.read())

    # Close the file
    infile.close()

Notice that the infile.read() command started at the third line of the file, where the first two infile.readline() commands left off.

Think of it like this: when the file is opened, a pointer is placed at the top left corner of the file at the beginning of the first line. Any time a read function is called, the cursor or pointer advances from where it already is. The first infile.readline() started at the beginning of the file and advanced to the end of the first line. Now, the pointer is positioned at the beginning of the second line. The second infile.readline() advanced to the end of the second line of the file, and left the pointer positioned at the beginning of the third line. infile.read() began from this position, and advanced through to the end of the file.

In general, if you want to switch between the different kinds of read commands, you should close the file and then open it again to start over.
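The pointer behaviour can be checked directly. The sketch below uses a made-up three-line file (sample_pointer.txt) and shows that a second read() returns an empty string once the pointer has reached the end. It also uses infile.seek(0), a standard file method that moves the pointer back to the start; this is an alternative to closing and reopening the file, which has the same effect.

```python
# Stand-in file with three made-up lines
filename = 'sample_pointer.txt'
outfile = open(filename, 'w')
outfile.write('line 1\nline 2\nline 3\n')
outfile.close()

infile = open(filename, 'r')
first = infile.readline()    # pointer moves to the start of line 2
rest = infile.read()         # reads from line 2 to the end of the file
leftover = infile.read()     # pointer is already at the end, so this is ''

infile.seek(0)               # move the pointer back to the start of the file
everything = infile.read()   # now the whole file is read again
infile.close()

print(repr(first))
print(repr(rest))
print(repr(leftover))
print(repr(everything))
```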

Reading all of the lines of a file into a list

infile.readlines() will read all of the lines into a list, where each line of the file is an item in the list. This is extremely useful, because once we have read the file in this way, we can loop through each line of the file and process it. This approach works well on data files where the data is organized into columns similar to a spreadsheet, because it is likely that we will want to handle each line in the same way.

The example below demonstrates this approach:

    # Create a variable for the file name
    filename = "Plates_output_simple.csv"

    # Open the file
    infile = open(filename, 'r')
    lines = infile.readlines()

    for line in lines:  # lines is a list with each item representing a line of the file
        if 'control' in line:
            print(line)  # print lines for control condition

    infile.close()  # close the file when you're done!

Using .split() to separate "columns"

Since our data is in a .csv file, we can use the split command to split each line of the file into a list. This can be useful if we want to access specific columns of the file.

    # Create a variable for the file name
    filename = "Plates_output_simple.csv"

    # Open the file
    infile = open(filename, 'r')
    lines = infile.readlines()

    for line in lines:
        sline = line.split(',')  # separates line into a list of items; ',' tells it to split the lines at the commas
        print(sline)  # each line is now a list

    infile.close()  # Always close the file!
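To see what .split(',') returns without opening a file, it can be tried on a single string. The line below is written by hand in the same comma-separated layout as the practice file.

```python
# One line typed by hand in the same layout as the practice file
line = 'colonies02.tif,2,exp,24,84,3.2,22\n'

sline = line.split(',')
print(sline)
print(sline[0])    # first column: the image name
print(len(sline))  # number of columns in this line
```

Every item in sline is still a text string, and the last item keeps the newline character from the end of the line.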

Consistent names, once again

At first glance, the variable name sline in the example above may not make much sense. In fact, we chose it to be an abbreviation for "split line", which exactly describes the contents of the variable.

You don't have to use this naming convention if you don't want to, but you should work to use consistent variable names across your code for common operations like this. It will make it much easier to open an old script and quickly understand exactly what it is doing.

Converting text to numbers

When we called the readlines() command in the previous code, Python read in the contents of the file as strings. If we want our code to recognize something in the file as a number, we need to tell it this!

For example, float('5.0') will tell Python to treat the text string '5.0' as the number 5.0. int(sline[4]) will tell our code to treat the text string stored in the 5th position of the list sline as an integer (non-decimal) number.
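A short sketch of these conversions on a hand-written split line (the list below stands in for one row of the practice file):

```python
# A split line typed by hand; every item is a text string
sline = ['colonies02.tif', '2', 'exp', '24', '84', '3.2', '22\n']

colony_count = int(sline[4])     # '84' becomes the integer 84
avg_size = float(sline[5])       # '3.2' becomes the decimal number 3.2
percent_area = int(sline[6])     # int() ignores surrounding whitespace such as '\n'

print(colony_count + 1)   # arithmetic works on the converted number
print(avg_size * 2)
print(sline[4] + '1')     # still text: this joins two strings together
```

The last line shows why the conversion matters: adding to the unconverted string glues characters together instead of doing arithmetic.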

For each line in the file, the ColonyCount is stored in the fifth column (index 4 with our 0-based counting).
Modify the code above to print the line only if the ColonyCount is greater than 30.

Solution

    # Create a variable for the file name
    filename = 'Plates_output_simple.csv'

    # Open the file
    infile = open(filename, 'r')
    lines = infile.readlines()

    for line in lines[1:]:  # skip the first line, which is the header
        sline = line.split(',')  # separates line into a list of items; ',' tells it to split the lines at the commas
        colonyCount = int(sline[4])  # store the colony count for the line as an integer
        if colonyCount > 30:
            print(sline)

    # Close the file
    infile.close()

Writing data out to a file

Often, we will want to write data to a new file. This is especially useful if we have done a lot of computations or data processing and we want to be able to save it and come back to it later.

Writing a file is the same multi-step process

Just like reading a file, we will open and write the file in multiple steps.

  1. Create a variable to hold the name of the file that we want to open. Often, this will be a new file that doesn't yet exist.
  2. Call a function to open the file. This time, we will specify that we are opening the file to write into it!
  3. Write the data into the file. This requires some careful attention to formatting.
  4. When we are done, we should remember to close the file!

The code below gives an example of writing to a file:

    filename = "output.txt"

    # 'w' tells Python we are opening the file to write into it
    outfile = open(filename, 'w')

    outfile.write("This is the first line of the file")
    outfile.write("This is the second line of the file")

    outfile.close()  # Close the file when we're done!

Where did my file end up?

Any time you open a new file and write to it, the file will be saved in your current working directory, unless you specified a different path in the variable filename.

Newline characters

When you examine the file you just wrote, you will see that all of the text is on the same line! This is because we must tell Python when to start on a new line by using the special string character '\n'. This newline character will tell Python exactly where to start each new line.

The example below demonstrates how to use newline characters:

    filename = 'output_newlines.txt'

    # 'w' tells Python we are opening the file to write into it
    outfile = open(filename, 'w')

    outfile.write("This is the first line of the file\n")
    outfile.write("This is the second line of the file\n")

    outfile.close()  # Close the file when we're done!

Go open the file you just wrote and check that the lines are spaced correctly.

Dealing with newline characters when you read a file

You may have noticed in the last file reading example that the printed output included newline characters at the end of each line of the file:

['colonies02.tif', '2', 'exp', '24', '84', '3.2', '22\n']
['colonies03.tif', '3', 'exp', '24', '792', '3', '78\n']
['colonies06.tif', '2', 'exp', '48', '85', '5.2', '46\n']

We can get rid of these newlines by using the .strip() function, which removes whitespace, including newline characters, from the beginning and end of a string:

    # Create a variable for the file name
    filename = 'Plates_output_simple.csv'

    # Open the file
    infile = open(filename, 'r')
    lines = infile.readlines()

    for line in lines[1:]:  # skip the first line, which is the header
        sline = line.strip()  # get rid of trailing newline characters at the end of the line
        sline = sline.split(',')  # separates line into a list of items; ',' tells it to split the lines at the commas
        colonyCount = int(sline[4])  # store the colony count for the line as an integer
        if colonyCount > 30:
            print(sline)

    # Close the file
    infile.close()
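To see what .strip() does on its own, it can be tried on a single hand-written line. The sketch also shows .rstrip('\n'), a related standard string method that removes only newline characters from the right-hand end, which can be useful when leading spaces need to be kept.

```python
# One line typed by hand, ending in a newline character
line = 'colonies02.tif,2,exp,24,84,3.2,22\n'

stripped = line.strip()           # removes whitespace (including '\n') from both ends
right_only = line.rstrip('\n')    # removes only newline characters at the end

print(repr(stripped))
print(repr(right_only))

# Stripping before splitting leaves a clean last item
sline = line.strip().split(',')
print(sline)
```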

Writing numbers to files

Just as Python automatically reads files in as strings, the write() function expects to only write strings. If we want to write numbers to a file, we will need to "cast" them as strings using the function str().

The code below shows an example of this:

    numbers = range(0, 10)
    filename = "output_numbers.txt"

    # 'w' tells Python we are opening the file to write into it
    outfile = open(filename, 'w')

    for number in numbers:
        outfile.write(str(number))

    outfile.close()  # Close the file when we're done!
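The sketch below shows what happens if the str() step is skipped: write() raises a TypeError when handed a number. It also shows the standard str.format() method as one way to control how many decimal places are written. The file name demo_numbers.txt is made up, and the file is written to a temporary folder so that nothing in your working directory is touched.

```python
import os
import tempfile

# Write into a temporary folder so no real files are touched
folder = tempfile.mkdtemp()
filename = os.path.join(folder, 'demo_numbers.txt')  # made-up file name

outfile = open(filename, 'w')
try:
    outfile.write(42)            # not a string, so Python raises an error
except TypeError as error:
    print('write() refused the number:', error)

outfile.write(str(42))                      # converting to a string first works
outfile.write(' ')
outfile.write('{:.2f}'.format(3.14159))     # format a decimal number to 2 places
outfile.close()

infile = open(filename, 'r')
contents = infile.read()
infile.close()
print(contents)
```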

Writing new lines and numbers

Go open and examine the file you just wrote. You will see that all of the numbers are written on the same line.

Modify the code to write each number on its own line.

Solution

    numbers = range(0, 10)  # Create the range of numbers
    filename = "output_numbers.txt"  # provide the file name

    # Open the file in 'write' mode
    outfile = open(filename, 'w')

    for number in numbers:
        outfile.write(str(number) + '\n')

    outfile.close()  # Close the file when we're done!

The file you just wrote should be saved in your working directory. Open the file and check that the output is correctly formatted with one number on each line.

Opening files in different 'modes'

When we have opened files to read or write data, we have used the function parameter 'r' or 'w' to specify which "mode" to open the file in.
'r' indicates we are opening the file to read data from it.
'w' indicates we are opening the file to write data into it.

Be very, very careful when opening an existing file in 'w' mode.
'w' will overwrite any data that is already in the file! The overwritten data will be lost!

If you want to add on to what is already in the file (instead of erasing and overwriting it), you can open the file in append mode by using the 'a' parameter instead.
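The difference between 'w' and 'a' can be seen by writing to the same file three times. This is a minimal sketch; the file name append_demo.txt is made up.

```python
filename = 'append_demo.txt'    # made-up file name

outfile = open(filename, 'w')   # 'w' starts the file fresh
outfile.write('first run\n')
outfile.close()

outfile = open(filename, 'a')   # 'a' keeps what is there and adds to the end
outfile.write('second run\n')
outfile.close()

infile = open(filename, 'r')
after_append = infile.read()
infile.close()

outfile = open(filename, 'w')   # 'w' again: the previous contents are erased
outfile.write('third run\n')
outfile.close()

infile = open(filename, 'r')
after_overwrite = infile.read()
infile.close()

print(repr(after_append))
print(repr(after_overwrite))
```

After the append, both lines are in the file; after the final 'w', only the last line remains.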

Pulling it all together

Read in the data from the file Plates_output_simple.csv that we have been working with. Write a new csv-formatted file that contains only the rows for control plates.
You will need to do the following steps:

  1. Open the file.
  2. Use .readlines() to create a list of lines in the file. Then close the file!
  3. Open a file to write your output into.
  4. Write the header line of the output file.
  5. Use a for loop to allow you to loop through each line in the list of lines from the input file.
  6. For each line, check if the growth condition was experimental or control.
  7. For the control lines, write the line of data to the output file.
  8. Close the output file when you're done!

Solution

Here's one way to do it:

    # Create a variable for the file name
    filename = 'Plates_output_simple.csv'

    # Open the file
    infile = open(filename, 'r')
    lines = infile.readlines()  # We will process the lines of the file later

    # Close the input file
    infile.close()

    # Create the file we will write to
    filename = 'ControlPlatesData.txt'
    outfile = open(filename, 'w')
    outfile.write(lines[0])  # This will write the header line of the file

    for line in lines[1:]:  # skip the first line, which is the header
        sline = line.split(',')  # separates line into a list of items; ',' tells it to split the lines at the commas
        condition = sline[2]  # store the condition for the line as a string
        if condition == "control":
            outfile.write(line)  # The variable line is already formatted correctly!

    outfile.close()  # Close the file when we're done!

Challenge Problem

Open and read in the data from Plates_output_simple.csv. Write a new csv-formatted file that contains only the rows for the control condition and includes only the columns for Time, colonyCount, avgColonySize, and percentColonyArea. Hint: you can use the .join() function to join a list of items into a string.

    names = ['Erin', 'Mark', 'Tessa']
    nameString = ', '.join(names)  # the ', ' tells Python to join the list with each item separated by a comma + space
    print(nameString)

Erin, Mark, Tessa

Solution

    # Create a variable for the input file name
    filename = 'Plates_output_simple.csv'

    # Open the file
    infile = open(filename, 'r')
    lines = infile.readlines()  # We will process the lines of the file later

    # Close the file
    infile.close()

    # Create the file we will write to
    filename = 'ControlPlatesData_Reduced.txt'
    outfile = open(filename, 'w')

    # Write the header line
    headerList = lines[0].split(',')[3:]  # This will return the list of column headers from 'time' on
    headerString = ','.join(headerList)  # join the items in the list with commas
    outfile.write(headerString)  # There is already a newline at the end, so no need to add one

    # Write the remaining lines
    for line in lines[1:]:  # skip the first line, which is the header
        sline = line.split(',')  # separates line into a list of items; ',' tells it to split the lines at the commas
        condition = sline[2]  # store the condition for the line as a string
        if condition == "control":
            dataList = sline[3:]
            dataString = ','.join(dataList)
            outfile.write(dataString)  # The last item still ends with the newline from the original line

    outfile.close()  # Close the file when we're done!

Key Points

  • Opening and reading a file is a multistep process: Defining the filename, opening the file, and reading the data

  • Data stored in files can be read in using a variety of commands

  • Writing information to a file requires attention to data types and formatting that isn't necessary with a print() statement


Source: https://eldoyle.github.io/PythonIntro/08-ReadingandWritingTextFiles/
