I/O Streams

Opening and Closing Streams


#include <stdio.h>

FILE * fopen (const char *filename, const char *opentype);
int fclose (FILE *stream);

fopen

The fopen function opens a stream for I/O to the file filename, and returns a pointer to the stream.

If the open fails, fopen returns a null pointer.

The opentype argument is a string that controls how the file is opened and specifies attributes of the resulting stream. It must begin with one of the following sequences of characters:

String  Description
`r'     Open an existing file for reading only.
`w'     Open the file for writing only. If the file already exists, it is truncated to zero length. Otherwise a new file is created.
`a'     Open a file for append access; that is, writing at the end of file only. If the file already exists, its initial contents are unchanged and output to the stream is appended to the end of the file. Otherwise, a new, empty file is created.
`r+'    Open an existing file for both reading and writing. The initial contents of the file are unchanged and the initial file position is at the beginning of the file.
`w+'    Open a file for both reading and writing. If the file already exists, it is truncated to zero length. Otherwise, a new file is created.
`a+'    Open or create file for both reading and appending. If the file exists, its initial contents are unchanged. Otherwise, a new file is created. The initial file position for reading is at the beginning of the file, but output is always appended to the end of the file.

As you can see, `+' requests a stream that can do both input and output. The ISO standard says that when using such a stream, you must call fflush or a file positioning function such as fseek when switching from reading to writing or vice versa. Otherwise, internal buffers might not be emptied properly. The GNU C library does not have this limitation; you can do arbitrary reading and writing operations on a stream in whatever order.
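The portable rule for update streams can be sketched as follows. The helper name capitalize_first is hypothetical and not part of any library; it assumes the stream was opened for update (for example with `r+' or `w+') and supports file positioning. It calls fseek between the read and the write, as the ISO standard requires, even though the GNU C library would tolerate omitting it.

```c
#include <ctype.h>
#include <stdio.h>

/* Hypothetical helper: capitalizes the first character of a stream
   that was opened for update.  Returns 0 on success, -1 on failure.  */
int
capitalize_first (FILE *stream)
{
  if (fseek (stream, 0L, SEEK_SET) != 0)
    return -1;

  int c = getc (stream);
  if (c == EOF)
    return -1;

  /* Reposition between the read and the write, as ISO C requires.  */
  if (fseek (stream, 0L, SEEK_SET) != 0)
    return -1;
  if (putc (toupper (c), stream) == EOF)
    return -1;

  /* Flush so that a later switch back to reading is also well defined.  */
  return fflush (stream) == EOF ? -1 : 0;
}
```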

Additional characters may appear after these to specify flags for the call. Always put the mode (`r', `w+', etc.) first; that is the only part you are guaranteed will be understood by all systems.

The GNU C library defines one additional character for use in opentype: the character `x' insists on creating a new file—if a file filename already exists, fopen fails rather than opening it. If you use `x' you are guaranteed that you will not clobber an existing file. This is equivalent to the O_EXCL option to the open function.

The character `b' in opentype has a standard meaning; it requests a binary stream rather than a text stream. But this makes no difference in POSIX systems (including the GNU system). If both `+' and `b' are specified, they can appear in either order.

fclose

The function fclose causes stream to be closed and the connection to the corresponding file to be broken. Any buffered output is written and any buffered input is discarded. The fclose function returns a value of 0 if the file was closed successfully, and EOF if an error was detected.

It is important to check for errors when you call fclose to close an output stream, because real, everyday errors can be detected at this time. For example, when fclose writes the remaining buffered output, it might get an error because the disk is full. Even if you know the buffer is empty, errors can still occur when closing a file if you are using NFS.
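A minimal sketch of opening, writing, and closing with every return value checked might look like this. The function name write_message is an assumption made for illustration only.

```c
#include <stdio.h>

/* Hypothetical helper: writes text to a new or truncated file.
   Returns 0 on success and -1 on any failure.  */
int
write_message (const char *filename, const char *text)
{
  FILE *stream = fopen (filename, "w");
  if (stream == NULL)
    return -1;                  /* The open failed.  */

  int status = 0;
  if (fputs (text, stream) == EOF)
    status = -1;

  /* fclose writes any buffered output, so an error such as a full
     disk may be reported only here.  */
  if (fclose (stream) == EOF)
    status = -1;

  return status;
}
```

A caller could then treat a nonzero result from write_message ("notes.txt", "hello\n") as a sign that the data may not have reached the file.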

Output by Characters or Lines


#include <stdio.h>

int fputc (int c, FILE *stream);
int putc (int c, FILE *stream);
int putchar (int c);

The fputc function converts the character c to type unsigned char, and writes it to the stream stream. EOF is returned if a write error occurs; otherwise the character c is returned.

The function putc is just like fputc, except that most systems implement it as a macro, making it faster. One consequence is that it may evaluate the stream argument more than once, which is an exception to the general rule that library macros evaluate each argument only once. The stream argument should therefore never be an expression with side effects. putc is usually the best function to use for writing a single character.

The putchar function is equivalent to putc with stdout as the value of the stream argument.
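As an illustration, the following sketch writes a run of characters with putc and stops at the first write error. The helper name put_chars is hypothetical. The stream argument is a plain variable, so it does not matter whether putc evaluates it more than once.

```c
#include <stdio.h>

/* Hypothetical helper: writes the first count bytes of text to stream.
   Returns the number of characters written before any error.  */
size_t
put_chars (const char *text, size_t count, FILE *stream)
{
  size_t written = 0;

  while (written < count)
    {
      /* The stream argument has no side effects, so it is safe even
         if putc is a macro that evaluates it more than once.  */
      if (putc (text[written], stream) == EOF)
        break;
      written++;
    }

  return written;
}
```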


#include <stdio.h>

int fputs (const char *s, FILE *stream);
int puts (const char *s);

The function fputs writes the string s to the stream stream. The terminating null character is not written. This function does not add a newline character, either. It outputs only the characters in the string.

This function returns EOF if a write error occurs, and otherwise a non-negative value.

For example:

          fputs ("Are ", stdout);
          fputs ("you ", stdout);
          fputs ("hungry?\n", stdout);
     

outputs the text `Are you hungry?' followed by a newline.

The puts function writes the string s to the stream stdout followed by a newline. The terminating null character of the string is not written. (Note that fputs does not write a newline as this function does.)

puts is the most convenient function for printing simple messages. For example:

          puts ("This is a message.");
     

outputs the text `This is a message.' followed by a newline.

Input by Characters

These functions return an int that is either a character of input, or the special value EOF (usually -1). It is important to store the result of these functions in a variable of type int instead of char, even when you plan to use it only as a character. Storing EOF in a char variable truncates its value to the size of a character, so that it is no longer distinguishable from the valid character `(char) -1'. So always use an int for the result of getc and friends, and check for EOF after the call; once you've verified that the result is not EOF, you can be sure that it will fit in a `char' variable without loss of information.


#include <stdio.h>

int fgetc (FILE *stream);
int getc (FILE *stream);
int getchar (void);

fgetc reads the next character as an unsigned char from the stream stream and returns its value, converted to an int. If an end-of-file condition or read error occurs, EOF is returned instead.

getc is just like fgetc, except that it is permissible (and typical) for it to be implemented as a macro that evaluates the stream argument more than once. getc is often highly optimized, so it is usually the best function to use to read a single character.

The getchar function is equivalent to getc with stdin as the value of the stream argument.
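The usual reading loop stores the result of getc in an int and compares it with EOF before using it as a character. Here is a small sketch; the helper name count_newlines is an assumption for illustration.

```c
#include <stdio.h>

/* Hypothetical helper: counts the newline characters remaining on
   stream.  */
long
count_newlines (FILE *stream)
{
  long count = 0;
  int c;                        /* int, not char, so EOF stays distinct.  */

  while ((c = getc (stream)) != EOF)
    if (c == '\n')
      count++;

  return count;
}
```

Because c is an int, a data byte with the value 255 is read as 255 and cannot be mistaken for EOF.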

Line-Oriented Input

Since many programs interpret input on the basis of lines, it is convenient to have functions to read a line of text from a stream.

Standard C has functions to do this, but they aren't very safe: null characters and even (for gets) long lines can confuse them. So the GNU library provides the nonstandard getline function that makes it easy to read lines reliably.

Another GNU extension, getdelim, generalizes getline. It reads a delimited record, defined as everything through the next occurrence of a specified delimiter character.


#include <stdio.h>

ssize_t getline (char **lineptr, size_t *n, FILE *stream);
char * fgets (char *s, int count, FILE *stream);

The function getline reads an entire line from stream, storing the text (including the newline and a terminating null character) in a buffer and storing the buffer address in *lineptr.

Before calling getline, you should place in *lineptr the address of a buffer *n bytes long, allocated with malloc. If this buffer is long enough to hold the line, getline stores the line in this buffer. Otherwise, getline makes the buffer bigger using realloc, storing the new buffer address back in *lineptr and the increased size back in *n.

If you set *lineptr to a null pointer, and *n to zero, before the call, then getline allocates the initial buffer for you by calling malloc.

In either case, when getline returns, *lineptr is a char * which points to the text of the line.

When getline is successful, it returns the number of characters read (including the newline, but not including the terminating null). This value enables you to distinguish null characters that are part of the line from the null character inserted as a terminator.

This function is a GNU extension, but it is the recommended way to read lines from a stream. The alternative standard functions are unreliable.

If an error occurs or end of file is reached without any bytes read, getline returns -1.
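A typical getline loop, sketched under the assumption that the library declares getline in stdio.h, starts with a null buffer pointer and a size of zero, lets getline grow the buffer as needed, and frees it once at the end. The helper name longest_line is hypothetical, and the feature-test macro at the top is an assumption that only some compiler modes need.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE             /* Assumption: some modes need this.  */
#endif
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>

/* Hypothetical helper: returns the length of the longest line on
   stream, counting the newline, or -1 if a read error occurred.  */
long
longest_line (FILE *stream)
{
  char *line = NULL;            /* Let getline allocate the buffer.  */
  size_t size = 0;
  ssize_t length;
  long longest = 0;

  while ((length = getline (&line, &size, stream)) != -1)
    if (length > longest)
      longest = length;

  free (line);                  /* One buffer is reused for every line.  */
  return ferror (stream) ? -1 : longest;
}
```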

The fgets function reads characters from the stream stream up to and including a newline character and stores them in the string s, adding a null character to mark the end of the string. You must supply count characters worth of space in s, but the number of characters read is at most count − 1. The extra character space is used to hold the null character at the end of the string.

If the system is already at end of file when you call fgets, then the contents of the array s are unchanged and a null pointer is returned. A null pointer is also returned if a read error occurs. Otherwise, the return value is the pointer s.

Warning: If the input data has a null character, you can't tell. So don't use fgets unless you know the data cannot contain a null. Don't use it to read files edited by the user because, if the user inserts a null character, you should either handle it properly or print a clear error message. We recommend using getline instead of fgets.
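When fgets is used anyway, the caller has to check whether the buffer ended with a newline to know whether a whole line was read. The sketch below uses a hypothetical helper named read_short_line. It relies on strlen, so it shares the limitation described in the warning: an embedded null character makes the line look shorter than it is.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: reads one line into buffer, which has room for
   size bytes, and strips the trailing newline.  Returns 1 if a whole
   line was read, 0 if only part of a line fit (or the final line had
   no newline), and -1 at end of file or on a read error.  */
int
read_short_line (char *buffer, int size, FILE *stream)
{
  if (fgets (buffer, size, stream) == NULL)
    return -1;

  size_t length = strlen (buffer);
  if (length > 0 && buffer[length - 1] == '\n')
    {
      buffer[length - 1] = '\0';
      return 1;
    }

  return 0;
}
```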

Unreading

In parser programs it is often useful to examine the next character in the input stream without removing it from the stream. This is called “peeking ahead” at the input because your program gets a glimpse of the input it will read next.

Using stream I/O, you can peek ahead at input by first reading it and then unreading it (also called pushing it back on the stream). Unreading a character makes it available to be input again from the stream, by the next call to fgetc or other input function on that stream.

What Unreading Means

Here is a pictorial explanation of unreading. Suppose you have a stream reading a file that contains just six characters, the letters `foobar'. Suppose you have read three characters so far. The situation looks like this:

     f  o  o  b  a  r
              ^

so the next input character will be `b'.

If instead of reading `b' you unread the letter `o', you get a situation like this:

     f  o  o  b  a  r
              |
           o--
           ^

so that the next input characters will be `o' and `b'.

If you unread `9' instead of `o', you get this situation:

     f  o  o  b  a  r
              |
           9--
           ^

so that the next input characters will be `9' and `b'.

How to Unread a Character


#include <stdio.h>

int ungetc (int c, FILE *stream);

The ungetc function pushes back the character c onto the input stream stream. So the next input from stream will read c before anything else.

If c is EOF, ungetc does nothing and just returns EOF. This lets you call ungetc with the return value of getc without needing to check for an error from getc.

The character that you push back doesn't have to be the same as the last character that was actually read from the stream. In fact, it isn't necessary to actually read any characters from the stream before unreading them with ungetc! But that is a strange way to write a program; usually ungetc is used only to unread a character that was just read from the same stream.

The GNU C library only supports one character of pushback—in other words, it does not work to call ungetc twice without doing input in between. Other systems might let you push back multiple characters; then reading from the stream retrieves the characters in the reverse order that they were pushed.

Pushing back characters doesn't alter the file; only the internal buffering for the stream is affected. If a file positioning function (such as fseek, fseeko or rewind) is called, any pending pushed-back characters are discarded.

Unreading a character on a stream that is at end of file clears the end-of-file indicator for the stream, because it makes the character of input available. After you read that character, trying to read again will encounter end of file.
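Here is a short sketch of peeking ahead with getc and ungetc. The function skips whitespace and pushes back the first character that is not whitespace, so the next read on the stream returns it. The name skip_whitespace is chosen for illustration.

```c
#include <ctype.h>
#include <stdio.h>

/* Skips whitespace on stream and leaves the first character that is
   not whitespace available to the next input call.  */
void
skip_whitespace (FILE *stream)
{
  int c;

  do
    c = getc (stream);
  while (c != EOF && isspace (c));

  /* Push back the character that ended the loop.  If it is EOF,
     ungetc does nothing, so no separate check is needed.  */
  ungetc (c, stream);
}
```

Only one character is pushed back between reads, which stays within the single character of pushback that the GNU C library supports.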

Block Input/Output

This section describes how to do input and output operations on blocks of data. You can use these functions to read and write binary data, as well as to read and write text in fixed-size blocks instead of by characters or lines. Binary files are typically used to read and write blocks of data in the same format as is used to represent the data in a running program. In other words, arbitrary blocks of memory—not just character or string objects—can be written to a binary file, and meaningfully read in again by the same program.

Storing data in binary form is often considerably more efficient than using the formatted I/O functions. Also, for floating-point numbers, the binary form avoids possible loss of precision in the conversion process. On the other hand, binary files can't be examined or modified easily using many standard file utilities (such as text editors), and are not portable between different implementations of the language, or different kinds of computers.


#include <stdio.h>

size_t fread (void *data, size_t size, size_t count, FILE *stream);
size_t fwrite (const void *data, size_t size, size_t count, FILE *stream);

The fread function reads up to count objects of size size into the array data, from the stream stream. It returns the number of objects actually read, which might be less than count if a read error occurs or the end of the file is reached. This function returns a value of zero (and doesn't read anything) if either size or count is zero.

If fread encounters end of file in the middle of an object, it returns the number of complete objects read, and discards the partial object. Therefore, the stream remains at the actual end of the file.

The fwrite function writes up to count objects of size size from the array data, to the stream stream. The return value is normally count, if the call succeeds. Any other value indicates some sort of error, such as running out of space.
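As a sketch of block I/O, the following pair of functions writes an array of structures and reads it back. The record type struct point and the helper names save_points and load_points are assumptions made for illustration. The data is only meaningful to a program built with the same structure layout.

```c
#include <stdio.h>

/* Hypothetical record type used only for this illustration.  */
struct point
{
  int x;
  double y;
};

/* Writes count records to stream.  Returns 0 on success and -1 if
   fwrite reports fewer than count records written.  */
int
save_points (const struct point *points, size_t count, FILE *stream)
{
  return fwrite (points, sizeof *points, count, stream) == count ? 0 : -1;
}

/* Reads up to max records from stream.  Returns the number of
   complete records read, which may be less than max.  */
size_t
load_points (struct point *points, size_t max, FILE *stream)
{
  return fread (points, sizeof *points, max, stream);
}
```

A short count from load_points does not say whether the cause was end of file or an error, so a caller that needs to know would consult feof and ferror afterwards.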

End-Of-File (EOF) and Errors

Many of the functions described in this chapter return the value of the macro EOF to indicate unsuccessful completion of the operation. Since EOF is used to report both end of file and other errors, it's often better to use the feof function to check explicitly for end of file and ferror to check for errors. These functions check indicators that are part of the internal state of the stream object, indicators set if the appropriate condition was detected by a previous I/O operation on that stream.


#include <stdio.h>

int feof (FILE *stream);
int ferror (FILE *stream);

The feof function returns nonzero if and only if the end-of-file indicator for the stream stream is set.

The ferror function returns nonzero if and only if the error indicator for the stream stream is set, indicating that an error has occurred on a previous operation on the stream.
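One way to use these indicators is to read until EOF and then ask the stream which condition ended the loop. This is a minimal sketch; the helper name read_to_end is hypothetical.

```c
#include <stdio.h>

/* Hypothetical helper: reads stream to the end and stores the number
   of characters read in *count.  Returns 0 if reading stopped at end
   of file and -1 if it stopped because of a read error.  */
int
read_to_end (FILE *stream, long *count)
{
  *count = 0;
  while (getc (stream) != EOF)
    (*count)++;

  /* EOF alone is ambiguous; the error indicator says which
     condition actually occurred.  */
  if (ferror (stream))
    return -1;

  return 0;
}
```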

File Positioning

The file position of a stream describes where in the file the stream is currently reading or writing. I/O on the stream advances the file position through the file. In the GNU system, the file position is represented as an integer, which counts the number of bytes from the beginning of the file.

During I/O to an ordinary disk file, you can change the file position whenever you wish, so as to read or write any portion of the file. Some other kinds of files may also permit this. Files which support changing the file position are sometimes referred to as random-access files.


#include <stdio.h>

long int ftell (FILE *stream);
int fseek (FILE *stream, long int offset, int whence);

The ftell function returns the current file position of the stream stream.

This function can fail if the stream doesn't support file positioning, or if the file position can't be represented in a long int, and possibly for other reasons as well. If a failure occurs, a value of -1 is returned.

The fseek function is used to change the file position of the stream stream. The value of whence must be one of the constants SEEK_SET, SEEK_CUR, or SEEK_END, to indicate whether the offset is relative to the beginning of the file, the current file position, or the end of the file, respectively.

This function returns a value of zero if the operation was successful, and a nonzero value to indicate failure. A successful call also clears the end-of-file indicator of stream and discards any characters that were “pushed back” by the use of ungetc.

fseek either flushes any buffered output before setting the file position or else remembers it so it will be written later in its proper place in the file.
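A common use of ftell and fseek together is to measure a file and then return to where reading left off. The sketch below assumes, as on the GNU system, that the file position is a byte count from the beginning of the file and that the size fits in a long int. The helper name stream_size is hypothetical.

```c
#include <stdio.h>

/* Hypothetical helper: returns the size of the file behind stream in
   bytes, or -1 on failure.  The file position is restored on success.  */
long
stream_size (FILE *stream)
{
  long saved = ftell (stream);
  if (saved == -1)
    return -1;

  if (fseek (stream, 0L, SEEK_END) != 0)
    return -1;
  long size = ftell (stream);

  /* Go back to the position the caller had before the measurement.  */
  if (fseek (stream, saved, SEEK_SET) != 0)
    return -1;

  return size;
}
```

Since a successful fseek discards pushed-back characters, this helper should not be called between an ungetc and the read that is meant to consume the pushed-back character.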

References

Input/Output on Streams (from the GNU C Library Reference Manual).

Neil Matthew and Richard Stone, Beginning Linux Programming, Third Edition, Wrox, 2004. ISBN 0-7645-4497-7. pp. 107-117.

W. Richard Stevens and Stephen A. Rago, Advanced Programming in the UNIX Environment, Second Edition, Addison Wesley, 2005. ISBN 0-201-43307-9.


Maintained by John Loomis, last updated 4 September 2006