Input and Output
The C language
provides no direct facilities for input and output (IO), and, instead, these
operations are supplied as functions in the standard library.
1. Formatted IO
2. File IO
3. Command-Shell Redirection
4. Command-Line Arguments
1. Formatted IO
1.1 Formatted Output: printf()
The function
printf() is a general purpose print function that converts and formats its
arguments to a character string, and prints the result to standard output (typically
the screen). The general interface for printf() is
int printf(const char *format, ...);
The first argument is a format string, composed of ordinary characters and conversion specifications, which defines the layout of the printed text.
This is followed
by zero or more optional arguments, with the number of arguments, and their
type, being determined by the contents of the format string.
The return value is the number of characters printed, unless an error occurs during output, whereupon the return value is negative.
Conversion
specifications are identified by a % character followed by a number of optional
fields and terminated by a type conversion character. A simple example is
printf("%d green %s sitting on a wall.\n", 10, "bottles");
where the ordinary
characters “green” and “sitting on a wall.\n” are printed verbatim, and the conversion
specifiers %d and %s insert the additional arguments at the appropriate
locations. The type conversion character must match its associated argument
type; in the example, the %d indicates an integer argument and the %s indicates
a string argument.

There are different conversion characters for ints (d, i, c), unsigned ints (u, o, x), doubles (f, e, g), strings (s), and pointers (p). Details of these may be found in any C reference text.
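To print a % character, the conversion specification %% is used. Between the % and the type conversion character there may exist a number of optional fields.
These control the formatting of the converted argument. Consider, for example, the conversion specifier
%-#012.4hd
Here -#0 are flags, 12 is the minimum field width, .4 is the precision, h is the size modifier, and d is the type conversion character.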
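The effect of the individual fields is easier to see on simpler specifiers. The following is a minimal sketch, not taken from the original text: the helper name show_fields() is hypothetical, and it uses the C99 function snprintf() so that the formatted text lands in a buffer rather than on the screen.

```c
#include <stdio.h>

/* Hypothetical helper: formats i and d with several field combinations
   and stores the text in out (at most n bytes, including the \0). */
int show_fields(char *out, size_t n, int i, double d)
{
    return snprintf(out, n,
                    "[%6d] [%-6d] [%06d] [%8.3f]",
                    i,   /* width 6, right-justified by default   */
                    i,   /* '-' flag: left-justified in width 6   */
                    i,   /* '0' flag: padded with leading zeros   */
                    d);  /* width 8, precision 3                  */
}
```

Called with 42 and 3.14159, the buffer should hold [    42] [42    ] [000042] [   3.142], which shows the width acting as a minimum amount of space and the precision fixing the number of decimal places.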
1.2 Formatted Input: scanf()
The scanf()
function is the input analog of printf(), providing many of the same conversion
specifications in the opposite direction (although there are differences, so be
wary).
It obtains data from
standard input, which is typically the keyboard. The general interface
for scanf() is
int scanf(const char *format, ...);
This is identical
to printf() in form, with a format string and a variable argument list, but an important
difference is that the arguments for scanf() must be pointer types. This allows
the input data to be stored at the address designated by the pointer using
pass-by-reference semantics.
For example,
double fval;
scanf("%lf", fval); /* Wrong */
scanf("%lf", &fval); /* Correct, store input in fval. */
scanf() reads
characters from standard input and interprets them according to the format string
specification.
It stops when it exhausts the format string, or when some input fails to match a conversion specification. Its return value is the number of values successfully assigned in its variable-length argument list, or EOF if the end of input (or a read error) occurs before any value is assigned.
If a conflict
occurs between the conversion specification and the actual input, the character
causing the conflict is left unread and is processed by the next standard input
operation.
The mechanics of
the format string and its conversion specifications are even more complicated for
scanf() than for printf(), and there are many details and caveats that will not
be discussed here.
Most of the
conversion characters for printf()—d, i, o, x, c, u, f, e, g, s, p, etc—have
similar meanings for scanf(), but there are certain differences, some subtle.
Thus, one should
not use the documentation for one as a guide for the other. Some of these
differences are as follows.
• Where printf() has four optional fields, scanf() has only two. It has the width and size modifier fields but not the flags and precision fields.
• For printf() the width field specifies a minimum reserve of space (i.e., padding), while for scanf() it defines a maximum limit on the number of characters to be read.
• An asterisk character (*) may be used in place of the width field for both printf() and scanf(), but with different meanings. For printf() it allows the width field to be determined by an additional argument, but for scanf() it suppresses assignment of an input value to its argument.
• The conversion character [ is not valid for printf(), but for scanf() it permits a scanset of characters to be specified, which allows scanf() to control exactly the characters it reads in.
• The size modifier field is typically neglected for printf(), but is vital for scanf(). For example, to read a float, one uses the conversion specifier %f. To read a double, the size modifier l (for long) must appear, %lf.
The scanf() format
string consists of conversion specifiers, ordinary characters, and white-space.
For example, the following statement is used to read a date of the form
dd/mm/yy.
int day, month, year;
scanf("%d/%d/%d", &day, &month, &year);
In general scanf()
ignores white-space characters in its format string, and skips over white-space
in stdin as it looks for input values. Exceptions to this rule arise with the
%c and %[ conversion specifiers, which do not skip white-space. For example, if
the user types in “one two” for each of the statements below, they will obtain
different results.
char s[10], c;
scanf("%s%c", s, &c);  /* s = "one", c = ' ' */
scanf("%s %c", s, &c); /* s = "one", c = 't' */
In the first case,
the %c reads in the next character after %s leaves off, which is a space. In
the second, the white-space in the format string causes scanf() to consume any
white-space after “one”, leaving the first non-space character (t) to be
assigned to c.
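The same behaviour can be checked without typing at the keyboard by using sscanf(), which reads from a string instead of stdin. The sketch below is illustrative only; the helper name char_after_word() and its skip_space flag are hypothetical.

```c
#include <stdio.h>

/* Hypothetical helper: returns the character that %c picks up after the
   first word of input, with or without a space in the format string.
   Returns '\0' if both values could not be read. */
char char_after_word(const char *input, int skip_space)
{
    char word[10];
    char c = '\0';
    int assigned;

    if (skip_space)
        assigned = sscanf(input, "%9s %c", word, &c);  /* skips white-space */
    else
        assigned = sscanf(input, "%9s%c", word, &c);   /* reads the space */

    return (assigned == 2) ? c : '\0';
}
```

For the input "one two", the call with skip_space set to 0 should yield a space, and the call with skip_space set to 1 should yield 't'. The width 9 in %9s keeps the word within the 10-character array.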
While the many details of scanf() formatting complicate a complete understanding, its basic use is quite simple. Rarely does an input statement get more complicated than
short a;
double b;
char c[20];
scanf("%hd %lf %s", &a, &b, c);
A few final
warnings about scanf().
First, keep in
mind that the arguments in its variable length argument list must be
pointers; forgetting the & in front of non-pointer variables is a very common
mistake.
Second, when there
is a conflict between a conversion specification and the actual
input, the
offending character is left unread. Thus, an expression like while
(scanf("%d", &val) != EOF) is dangerous as it will loop forever
if there is a conflict. Third, while scanf() is a good choice when the exact
format of the input is known, other input techniques may be better suited if
the format may vary. For example, the combination of fgets() and sscanf(),
described in the next section, is a useful alternative if the input format is
not precisely known. The fgets() function reads a line of characters into a
buffer, and sscanf() extracts the data, and can pick out different parts using multiple
passes if necessary.
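One way to avoid the endless loop described in the second warning is to test the return value against the number of values expected, and to discard the offending input when a conflict occurs. The following is a sketch under assumptions: the function name sum_ints() is hypothetical, and it takes a FILE pointer and uses fscanf() (the file-stream form of scanf()) so that it can be exercised on any stream, including stdin.

```c
#include <stdio.h>

/* Hypothetical helper: sums every integer found on the stream,
   skipping over any token that is not a valid integer. */
long sum_ints(FILE *fp)
{
    long total = 0;
    int val, status;

    while ((status = fscanf(fp, "%d", &val)) != EOF) {
        if (status == 1)
            total += val;        /* a value was assigned */
        else
            fscanf(fp, "%*s");   /* conflict: discard the offending token */
    }
    return total;
}
```

Calling sum_ints(stdin) on the input "1 2 x 3" should return 6, since the x is read and thrown away by the assignment-suppressing %*s rather than being left in the stream forever.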
1.3 String Formatting
The functions sprintf()
and sscanf() perform essentially the same operations as printf() and scanf(),
respectively, but, rather than interact with stdout or stdin, they operate on a
character array argument. They present the following interfaces.
int sprintf(char *buf, const char *format, ...);
int sscanf(const char *buf, const char *format, ...);
The sprintf() function
stores the resulting formatted string in buf and automatically appends this
string with a terminating \0 character. It returns the number of characters
stored (excluding \0). This function is very useful for a wide range of string
manipulation operations. For example, the following code segment creates a
format string at runtime, which prevents scanf() from overflowing its character
buffer.
char buf[100], format[10];
sprintf(format, "%%%ds", (int)(sizeof(buf) - 1)); /* Create format string "%99s". */
scanf(format, buf); /* Get string from stdin. */
The input string
is thus limited to not more than 99 characters plus 1 for the terminating \0. sscanf()
extracts values from the string buf according to the format string, and stores
the results in the additional argument list. It behaves just like scanf() with buf
replacing stdin as the source of input characters. An attempt to read beyond
the end of string buf for sscanf() is equivalent to reaching the end-of-file
for scanf(). The sscanf() function is often used in conjunction with a line input
function, such as fgets(), as in the following example.
char buf[100];
double dval;
fgets(buf, sizeof(buf), stdin); /* Get a line of input, store in buf. */
sscanf(buf, "%lf", &dval); /* Extract a double from buf. */
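In practice both calls can fail, so it is worth checking their return values. The sketch below wraps the fgets() and sscanf() combination in a function; the name read_double() is hypothetical, and the stream is passed as a parameter (stdin would be the usual choice).

```c
#include <stdio.h>

/* Hypothetical helper: reads one line from fp and tries to extract a
   double from it. Returns 1 on success, 0 at end-of-file or if the
   line does not begin with a number. */
int read_double(FILE *fp, double *out)
{
    char buf[100];

    if (fgets(buf, sizeof(buf), fp) == NULL)
        return 0;                          /* end-of-file or read error */
    return sscanf(buf, "%lf", out) == 1;   /* 1 only if a value was stored */
}
```

Because a whole line is consumed on every call, a malformed line cannot get stuck in the input stream; the caller simply sees a 0 and can ask again.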
2. File IO
The C language is
closely tied to the UNIX operating system; they were initially developed in parallel,
and UNIX was implemented in C. Thus, much of the standard C library is modelled
on UNIX facilities, and in particular the way it performs input and output by
reading or writing to files.
2.1 Opening and Closing Files
A file is referred
to by a FILE pointer, where FILE is a structure declaration defined with a typedef
in header stdio.h. This file pointer “points to a structure that contains
information about the file, such as the location of a buffer, the current
character position in the buffer, whether the file is being read or written,
and whether errors or end-of-file have occurred”. All these implementation
details are hidden from users of the standard library via the FILE type-name
and the associated library functions.
A file is opened
by the function fopen(), which has the interface
FILE *fopen(const char *name, const char *mode);
The first argument,
name, is a character string containing the name of the file. The second is a mode
string, which determines how the file may be used.
There are three
basic modes:
read "r",
write "w" and append "a".
The first opens an
existing file for reading, and fails if the file does not exist.
The other two open
a file for writing, and create a new file if it does not already exist.
Opening an existing file in "w" mode first clears the file of its existing data (i.e., overwrites the existing file).
Opening in "a"
mode preserves the existing data and adds new data to the end of the file.
Each of these
modes may include an additional “update” specification signified by a + character
(i.e., "r+", "w+", "a+"), which enables the file
stream to be used for both input and output. This ability is most useful in
conjunction with the random access file operations.
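Some operating systems distinguish between text files and binary files, for example in how line endings are stored. The standard C library caters for this variation by permitting a file to be explicitly marked as binary with the addition of a b character to the file-open mode (e.g., "rb" opens a binary file for reading).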
If opening a file
is successful, fopen() returns a valid FILE * pointer. If there is an error, it
returns NULL (e.g., attempting to open a file for reading that does not exist,
or attempting to open a file without appropriate permissions). As with other
functions that return pointers to limited resources, such as the dynamic memory
allocation functions, it is prudent to always check the return value for NULL.
To close a file,
the file pointer is passed to fclose(), which has the interface
int fclose(FILE *fp);
This function
breaks the connection with the file and frees the file pointer. It is good
practice to free file pointers when a file is no longer needed as most
operating systems have a limit on the number of files that a program may have
open simultaneously. However, fclose() is called automatically for each open
file when a program terminates.
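The open, check, use, and close pattern described above might look as follows. This is a minimal sketch; the function name count_chars() is hypothetical and is only meant to show where the NULL check and the fclose() call belong.

```c
#include <stdio.h>

/* Hypothetical helper: returns the number of characters in the named
   file, or -1 if the file cannot be opened for reading. */
long count_chars(const char *name)
{
    long count = 0;
    FILE *fp = fopen(name, "r");

    if (fp == NULL)          /* e.g., missing file or no permission */
        return -1L;
    while (getc(fp) != EOF)
        count++;
    fclose(fp);              /* release the file as soon as it is done with */
    return count;
}
```

Returning a distinct value such as -1 lets the caller tell "could not open" apart from "opened an empty file".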
2.2 Standard IO
When a program begins execution, there are three text streams predefined and open. These are standard input (stdin), standard output (stdout), and standard error (stderr).
The first two
signify “normal” input and output, and for most interactive environments are
directed to the keyboard and screen, respectively. Their input and output
streams are usually buffered, which means that characters are accumulated in a
queue and sent in packets, minimising expensive system calls.
Buffering may be
controlled by the standard function setbuf(). The stderr stream is reserved for
sending error messages. Like stdout it is typically directed to the screen, but
its output is unbuffered.
2.3 Sequential File Operations
Once a file is
opened, operations on the file—reading or writing—usually negotiate the file in
a sequential manner, from the beginning to the end. The standard library
provides a number of different operations for sequential IO.
The simplest
functions process a file one character at a time. To write a character there
are the functions
int fputc(int c, FILE *fp);
int putc(int c, FILE *fp);
int putchar(int c);
where calling putchar(c)
is equivalent to calling putc(c, stdout). The functions putc() and fputc() are
identical, but putc() is typically implemented as a macro for efficiency. These
functions return the character that was written, or EOF if there was an error
(e.g., the hard disk was full).
To read a character,
there are the functions
int fgetc(FILE *fp);
int getc(FILE *fp);
int getchar(void);
which are
analogous to the character output functions. Calling getchar() is equivalent to
calling getc(stdin), and getc() is usually a macro implementation of fgetc(). These
functions return the next character in the character stream unless either the
end-of-file is reached or an error occurs.
In these anomalous
cases, they return EOF. It is possible to push a character c back onto an input
stream using the function
int ungetc(int c, FILE *fp);
The pushed back
character will be read by the next call to getc() (or getchar() or fscanf(),
etc) on that stream.
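A typical use of ungetc() is to read "one character too far" and then put that character back. The sketch below illustrates the idea; the function name read_digits() is hypothetical.

```c
#include <ctype.h>
#include <stdio.h>

/* Hypothetical helper: reads a run of decimal digits from fp and returns
   its value. The first non-digit character is pushed back so that the
   next read on the stream sees it. */
int read_digits(FILE *fp)
{
    int c, value = 0;

    while ((c = getc(fp)) != EOF && isdigit(c))
        value = value * 10 + (c - '0');
    if (c != EOF)
        ungetc(c, fp);       /* not ours: return it to the stream */
    return value;
}
```

On a stream containing "123abc", the function should return 123 and leave the character a as the next one to be read. Note that c is declared as an int, not a char, so that EOF can be distinguished from every valid character.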
Formatted IO can
be performed on files using the functions
int fprintf(FILE *fp, const char *format, ...);
int fscanf(FILE *fp, const char *format, ...);
These functions
are generalisations of printf() and scanf(), which are equivalent to the calls
fprintf(stdout, format, ...) and
fscanf(stdin, format, ...), respectively.
Characters can be
read from a file a line at a time using the function
char *fgets(char *buf, int max, FILE *fp);
which reads at
most max-1 characters from the file pointed to by fp and stores the resulting
string in buf. It automatically appends a \0 character to the end of the
string. The function returns when it encounters a \n character (i.e., a
newline), or reaches the end-of-file, or has read the maximum number of
characters. It returns a pointer to buf if successful, and NULL for end-of-file
or if there was an error.
Character strings
may be written to a file using the function
int fputs(const char *str, FILE *fp);
which returns a
non-negative value if successful and EOF if there was an error. Note, the
string need not contain a \n character, and fputs() will not append one, so
strings may be written to the same line with successive calls.
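Together, fgets() and fputs() are enough to copy a text stream a line at a time. The following is a sketch only; the function name copy_lines() and the buffer size of 256 are arbitrary choices made for illustration.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: copies in to out a line at a time and returns the
   number of newline-terminated lines copied, or -1 on a write error. */
long copy_lines(FILE *in, FILE *out)
{
    char buf[256];
    long lines = 0;

    while (fgets(buf, sizeof(buf), in) != NULL) {
        if (fputs(buf, out) == EOF)
            return -1L;
        if (strchr(buf, '\n') != NULL)   /* fgets() keeps the newline */
            lines++;
    }
    return lines;
}
```

A line longer than the buffer is simply copied in several pieces; since fputs() does not append a newline, the pieces join up again in the output.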
For reading and
writing binary files, a pair of functions are provided that enable objects to
be passed to and from files directly without first converting them to a character
string. These functions are
size_t fread(void *ptr, size_t size, size_t nobj, FILE *fp);
size_t fwrite(const void *ptr, size_t size, size_t nobj, FILE *fp);
and they permit
objects of any type to be read or written, including arrays and structures. For
example, if a structure called Astruct were defined, then an array of such
structures could be written to file as follows.
struct Astruct mystruct[10];
fwrite(mystruct, sizeof(struct Astruct), 10, fp);
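Both functions return the number of objects actually transferred, which should be compared with the number requested. The sketch below writes an array of structures and reads it back; the struct Point type and the function name round_trip() are hypothetical, and the stream is assumed to be open for update in binary mode (e.g., "wb+").

```c
#include <stdio.h>

struct Point { int x, y; };   /* hypothetical record type */

/* Hypothetical helper: writes n records to fp, rewinds, and reads them
   back into dst. Returns the number of records read back. */
size_t round_trip(FILE *fp, const struct Point *src,
                  struct Point *dst, size_t n)
{
    if (fwrite(src, sizeof(struct Point), n, fp) != n)
        return 0;             /* short write: report failure */
    rewind(fp);               /* reposition before switching to reading */
    return fread(dst, sizeof(struct Point), n, fp);
}
```

The data is stored in the machine's own binary layout, so a file written this way is generally only readable by a program built for the same platform.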
2.4 Random Access File Operations
The previous file
IO functions progress through a file sequentially. The standard library also
provides a means to move back and forth within a file to any specified
location. These file positioning functions are
long ftell(FILE *fp);
int fseek(FILE *fp, long offset, int from);
void rewind(FILE *fp);
The first, ftell(), returns the current position
in the file stream. For binary files this value is the number of characters
preceding the current position.
For text files the
value is implementation defined. In both cases the value is in a form suitable
for the second argument of fseek(),
and the value 0L represents the beginning of the file.
The second
function, fseek(), sets the file position to a location specified by its second
argument. This parameter is an offset, which shifts the file position relative
to a given reference location. The reference location is given by the third
argument and may be one of three values as defined by the symbolic constants SEEK_SET, SEEK_CUR, and SEEK_END.
These specify the
beginning of the file, the current file position, and the end of file,
respectively. Having shifted the file position via fseek(), a subsequent read
or write will proceed from this new position.
For binary files, fseek()
may be used to move the file position to any chosen location. For text files,
however, the set of valid operations is restricted to the following.
fseek(fp, 0L, SEEK_SET);  /* Move to beginning of file. */
fseek(fp, 0L, SEEK_CUR);  /* Move to current location (no effect). */
fseek(fp, 0L, SEEK_END);  /* Move to end of file. */
fseek(fp, pos, SEEK_SET); /* Move to pos. */
In the last case,
the value pos must be a position returned by a previous call to ftell(). Binary
files, on the other hand, permit more arbitrary use, such as
fseek(fp, -4L, SEEK_CUR); /* Move back 4 bytes. */
The program below
shows an example of ftell() and fseek() to determine the length of a file in
bytes. The file itself may be plain text, but it is opened as binary so that ftell()
returns the number of characters to the end-of-file.
/* Compute the length of a file in bytes. From Snippets (ansiflen.c) */
long flength(char *fname)
{
    long length = -1L;
    FILE *fptr;

    fptr = fopen(fname, "rb");
    if (fptr != NULL) {
        fseek(fptr, 0L, SEEK_END);
        length = ftell(fptr);
        fclose(fptr);
    }
    return length;
}
The third
function, rewind(), returns the
position to the beginning of the file. Calling rewind(fp) is equivalent to the
statement fseek(fp, 0L, SEEK_SET).
Two other file
positioning functions are available in the standard library: fgetpos() and fsetpos().
These perform essentially the same tasks as ftell() and fseek(), respectively,
but are able to handle files too large for their positions to be representable
by a long integer.
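Random access is most useful when a binary file holds fixed-size records, because the position of any record can then be computed directly. The following sketch assumes a file consisting of nothing but int values; the function name read_record() is hypothetical.

```c
#include <stdio.h>

/* Hypothetical helper: reads the int stored at position index (counting
   from 0) in a binary file of ints. Returns 1 on success, 0 otherwise. */
int read_record(FILE *fp, long index, int *out)
{
    long offset = index * (long)sizeof(int);

    if (fseek(fp, offset, SEEK_SET) != 0)
        return 0;                              /* could not reposition */
    return fread(out, sizeof(int), 1, fp) == 1; /* 0 if past end of file */
}
```

Records can be fetched in any order this way, without reading the data that precedes them.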
3. Command-Shell Redirection
Often programs are
executed from a command-interpreter environment (also called a shell). Most operating
systems possess such an interpreter. For example, Win32 has a DOS-shell and
UNIX-like systems have various similar shell environments such as the C-shell,
the Bourne-shell, the Korn-shell, etc. Most shells facilitate redirection of stdin
and stdout using the commands < and >, respectively.
Redirection is not part of the C language, but an operating system service that supports the C input-output model.
#include <stdio.h>

/* Write stdin to stdout */
int main(void)
{
    int c;

    while ((c = getchar()) != EOF)
        putchar(c);
    return 0;
}
Consider the
example program above. It simply reads characters from stdin and forwards them to
stdout. Normally this means the characters typed at the keyboard are echoed on
the screen after the user hits the “enter” key. Assume the program executable
is named “repeat”.
repeat
type some text 123
type some text 123
However, a file
may be substituted for the keyboard by redirection.
repeat <infile.txt
display contents of infile.txt
Alternatively, a
file may be substituted for the screen, or for both keyboard and screen as in
the following example, which copies the contents of infile.txt to outfile.txt.
repeat <infile.txt >outfile.txt
Further redirection commands are >> and |. The former redirects stdout but, unlike >, appends the redirected output rather than overwriting the existing file contents. The latter is called a “pipe”, and it directs the stdout of one program to the stdin of another. For example,
prog1 | prog2
Under a DOS-style shell, prog1 executes first and its stdout is accumulated in a temporary buffer and, once the program has terminated, prog2 executes with this set of output as its stdin. UNIX-like shells typically run both programs concurrently, with prog2 reading the output as prog1 produces it. The stderr stream is not redirected by these commands, and so will still print messages to the screen even if stdout is redirected.
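This separation is the reason for sending results to stdout and diagnostics to stderr. The sketch below shows a small filter written in that style; the function name upcase_stream() is hypothetical, and the three streams are passed as parameters so the same code works with stdin, stdout and stderr or with ordinary files.

```c
#include <ctype.h>
#include <stdio.h>

/* Hypothetical helper: copies in to out in upper case and writes a
   one-line summary to err. Returns the number of characters copied. */
long upcase_stream(FILE *in, FILE *out, FILE *err)
{
    long n = 0;
    int c;

    while ((c = getc(in)) != EOF) {
        putc(toupper(c), out);   /* results go to the output stream */
        n++;
    }
    fprintf(err, "processed %ld characters\n", n);  /* diagnostics */
    return n;
}
```

If a program calls upcase_stream(stdin, stdout, stderr) and is run with both < and > redirection, the converted text goes to the output file while the summary line should still appear on the screen.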