Welcome to Vigges Developer Community - Open, Learning, Share

c++ - Why does std::getline() skip input after a formatted extraction?

I have the following piece of code that prompts the user for their name and state:

#include <iostream>
#include <string>

int main()
{
    std::string name;
    std::string state;

    if (std::cin >> name && std::getline(std::cin, state))
    {
        std::cout << "Your name is " << name << " and you live in " << state;
    }
}

What I find is that the name has been successfully extracted, but not the state. Here is the input and resulting output:

Input:

"John"
"New Hampshire"

Output:

"Your name is John and you live in "

Why has the name of the state been omitted from the output? I've given the proper input, but the code somehow ignores it. Why does this happen?



1 Answer


Why does this happen?

This has little to do with the input you provided and everything to do with the default behavior of std::getline(). When you entered the name (std::cin >> name), you submitted not only the characters you typed but also a newline, which was appended to the stream:

"John\n"

A newline is always appended to your input when you press Enter or Return in a terminal. It is also used in files to mark the end of a line. After the extraction into name, the newline stays in the buffer until the next I/O operation either discards or consumes it. When control reaches std::getline(), the newline is the very first character it sees, so it discards the newline and stops reading immediately, leaving state empty. This is simply what the function is specified to do: it reads a line and stops at the first newline it finds.

Because this leading newline gets in the way of what your program is meant to do, it has to be skipped or ignored somehow. One option is to call std::cin.ignore() after the first extraction. It discards the next available character so that the newline is no longer in the way.

std::getline(std::cin.ignore(), state)
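To see the effect without typing at a terminal, here is a minimal sketch that reproduces the problem and the one-character fix. It assumes a std::istringstream stands in for std::cin, and the function name read_state is made up for illustration.

```cpp
#include <istream>
#include <sstream>
#include <string>

// Hypothetical helper: reads a name with operator>> and then a line with
// std::getline(). When skip_newline is true, one character (the leftover
// newline) is discarded before the line is read.
std::string read_state(const std::string& text, bool skip_newline)
{
    std::istringstream in(text);   // stands in for std::cin
    std::string name;
    std::string state;

    in >> name;                    // stops in front of the '\n' after the name
    if (skip_newline)
        in.ignore();               // discards exactly one character
    std::getline(in, state);       // reads up to the next '\n' and discards it
    return state;
}
```

With the input "John\nNew Hampshire\n", read_state(text, false) returns an empty string, while read_state(text, true) returns "New Hampshire".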

In-Depth Explanation:

This is the overload of std::getline() that you called:

template<class charT>
std::basic_istream<charT>& getline( std::basic_istream<charT>& input,
                                    std::basic_string<charT>& str )

Another overload of this function takes a delimiter of type charT. A delimiter is a character that marks the boundary between sequences of input. Since no delimiter was supplied, this overload uses the newline character, input.widen('\n'), as the delimiter.
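As a rough illustration of the delimiter overload, the sketch below splits a string on an arbitrary character. The split function is a made-up helper, not part of the standard library.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: splits text into fields with the three-argument
// std::getline() overload, which stops at delim instead of at '\n'.
std::vector<std::string> split(const std::string& text, char delim)
{
    std::istringstream in(text);
    std::vector<std::string> fields;
    std::string field;

    // Each call reads up to delim (or end of input) and discards delim.
    while (std::getline(in, field, delim))
        fields.push_back(field);
    return fields;
}
```

Note that once a different delimiter is supplied, the newline is no longer special: split("a\nb", ',') yields a single field that still contains the newline.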

Now, these are a few of the conditions whereby std::getline() terminates input:

  • If the stream has extracted the maximum number of characters a std::basic_string<charT> can hold (str.max_size())
  • If the end of the input (EOF) has been reached
  • If the delimiter has been found
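The last two conditions in this list can be told apart by checking the stream state afterwards. Here is a small sketch, assuming a std::istringstream as the input source; LineResult and read_line are names invented for the example.

```cpp
#include <sstream>
#include <string>

// Hypothetical result type: the line that was read and whether the read
// ended because the input ran out.
struct LineResult
{
    std::string line;
    bool hit_eof;
};

LineResult read_line(const std::string& text)
{
    std::istringstream in(text);
    LineResult result;
    std::getline(in, result.line);
    result.hit_eof = in.eof();     // set only if the input ran out first
    return result;
}
```

For "New Hampshire\n" the delimiter ends the read and hit_eof is false. For "New Hampshire" with no trailing newline, the same text is returned but hit_eof is true.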

The third condition is the one we're dealing with. Your input into state is represented thusly:

"John\nNew Hampshire"
     ^
     |
 next_pointer

where next_pointer is the next character to be parsed. Since the character at the next position in the input sequence is the delimiter, std::getline() quietly discards that character, advances next_pointer to the next available character, and stops reading. The rest of the characters you provided therefore remain in the buffer for the next I/O operation. You'll notice that if you call std::getline() on state a second time, the extraction yields the correct result, because the previous call already discarded the delimiter.
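That last observation is easy to check. The sketch below calls std::getline() twice after the formatted extraction and returns both results; read_twice is a hypothetical name and a std::istringstream is assumed in place of std::cin.

```cpp
#include <sstream>
#include <string>
#include <utility>

// Hypothetical helper: returns what the first and the second std::getline()
// call produce after "in >> name".
std::pair<std::string, std::string> read_twice(const std::string& text)
{
    std::istringstream in(text);
    std::string name;
    std::string first;
    std::string second;

    in >> name;                 // leaves the '\n' in the buffer
    std::getline(in, first);    // consumes only that '\n'
    std::getline(in, second);   // now reads the actual line
    return {first, second};
}
```

With "John\nNew Hampshire\n" the first element of the pair is empty and the second is "New Hampshire".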


You may have noticed that you don't typically run into this problem when extracting with the formatted input operator (operator>>()). This is because input streams use whitespace as delimiters for input and have the std::skipws [1] manipulator set on by default. Streams will discard the leading whitespace from the stream when beginning to perform formatted input. [2]
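A quick way to convince yourself of this is to turn the flag off and watch the formatted extraction fail. This is only a sketch; can_extract_word is an invented helper and the input comes from a std::istringstream.

```cpp
#include <ios>
#include <sstream>
#include <string>

// Hypothetical helper: reports whether operator>> can extract a word from
// text. When skip_whitespace is false, std::noskipws is applied first.
bool can_extract_word(const std::string& text, bool skip_whitespace)
{
    std::istringstream in(text);
    if (!skip_whitespace)
        in >> std::noskipws;    // clears the skipws flag on this stream

    std::string word;
    // With skipws cleared, leading whitespace is not discarded, so no
    // characters are extracted and the stream enters a fail state.
    return static_cast<bool>(in >> word);
}
```

For the text "\n  John", the call succeeds with the default flag and fails once std::noskipws is applied.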

Unlike the formatted input operators, std::getline() is an unformatted input function. And all unformatted input functions have the following code somewhat in common:

typename std::basic_istream<charT>::sentry ok(istream_object, true);

The above is a sentry object, which is instantiated in all formatted and unformatted I/O functions in a standard C++ implementation. Sentry objects prepare the stream for I/O and determine whether or not it is in a fail state. Only in the unformatted input functions is the second argument to the sentry constructor true. That argument means that leading whitespace will not be discarded from the beginning of the input sequence. Here is the relevant quote from the Standard [§27.7.2.1.3/2]:

 explicit sentry(basic_istream<charT, traits>& is, bool noskipws = false);

[...] If noskipws is zero and is.flags() & ios_base::skipws is nonzero, the function extracts and discards each character as long as the next available input character c is a whitespace character. [...]

Since the above condition is false, the sentry object will not discard the whitespace. This function sets noskipws to true because the point of std::getline() is to read raw, unformatted characters into a std::basic_string<charT> object.
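The contrast between the two kinds of input function can be sketched as follows. Both helpers are made up for this example and read from a std::istringstream.

```cpp
#include <sstream>
#include <string>

// Formatted input: leading whitespace (including newlines) is skipped.
std::string first_token(const std::string& text)
{
    std::istringstream in(text);
    std::string token;
    in >> token;
    return token;
}

// Unformatted input: leading whitespace is kept as part of the line.
std::string first_line(const std::string& text)
{
    std::istringstream in(text);
    std::string line;
    std::getline(in, line);
    return line;
}
```

Given "   Hello world\n", first_token() returns "Hello" whereas first_line() returns "   Hello world" with the three leading spaces intact.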


The Solution:

There's no way to stop this behavior of std::getline(). What you'll have to do is discard the newline yourself before std::getline() runs (but after the formatted extraction). This can be done by using ignore() to discard the rest of the current line, so that the next read starts on a fresh line:

if (std::cin >> name &&
    std::cin.ignore(std::numeric_limits<std::streamsize>::max(), '\n') &&
    std::getline(std::cin, state))
{ ... }

You'll need to include <limits> to use std::numeric_limits. std::basic_istream<...>::ignore() discards up to a specified number of characters, stopping early if it finds the delimiter or reaches the end of the stream (ignore() also discards the delimiter if it finds it). std::numeric_limits<std::streamsize>::max() is the largest value a std::streamsize can hold; ignore() treats it as a special value meaning "no limit", so characters are discarded until the delimiter or the end of the stream is reached.
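Discarding the whole rest of the line matters when something other than a bare newline follows the name, such as trailing spaces. The following sketch compares the two forms of ignore(); read_state_after_name is an invented helper and a std::istringstream stands in for std::cin.

```cpp
#include <ios>
#include <limits>
#include <sstream>
#include <string>

// Hypothetical helper: reads a name, discards either one character or the
// rest of the line, then reads the state with std::getline().
std::string read_state_after_name(const std::string& text,
                                  bool discard_whole_line)
{
    std::istringstream in(text);
    std::string name;
    std::string state;

    in >> name;
    if (discard_whole_line)
        in.ignore(std::numeric_limits<std::streamsize>::max(), '\n');
    else
        in.ignore();            // one character only
    std::getline(in, state);
    return state;
}
```

With "John  \nNew Hampshire\n" (two spaces after the name), the single-character version leaves one space plus the newline behind, so state ends up as " ". The whole-line version returns "New Hampshire".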

Another way to discard the whitespace is to use the std::ws function which is a manipulator designed to extract and discard leading whitespace from the beginning of an input stream:

if (std::cin >> name && std::getline(std::cin >> std::ws, state))
{ ... }

What's the difference?

The difference is that ignore(std::streamsize count = 1, int_type delim = Traits::eof()) [3] indiscriminately discards characters until it either discards count characters, finds the delimiter (specified by the second argument delim) or hits the end of the stream. std::ws is only used for discarding whitespace characters from the beginning of the stream.

If you are mixing formatted input with unformatted input and you need to discard residual whitespace, use std::ws. Otherwise, if you need to clear out invalid input regardless of what it is, use ignore(). In our example, we only need to clear whitespace since the stream consumed your input of "John" for the name variable. All that was left was the newline character.
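The two approaches only differ on inputs that contain more than the bare newline, so here is a sketch with two such inputs. The Cleanup enum and state_after_name are hypothetical names, and a std::istringstream replaces std::cin.

```cpp
#include <istream>
#include <limits>
#include <sstream>
#include <string>

enum class Cleanup { Ws, IgnoreLine };

// Hypothetical helper: reads a name, cleans up with the chosen technique,
// then reads the state with std::getline().
std::string state_after_name(const std::string& text, Cleanup how)
{
    std::istringstream in(text);
    std::string name;
    std::string state;

    in >> name;
    if (how == Cleanup::Ws) {
        // Discards whitespace only, including any at the start of the
        // next line.
        std::getline(in >> std::ws, state);
    } else {
        // Discards everything up to and including the next '\n'.
        in.ignore(std::numeric_limits<std::streamsize>::max(), '\n');
        std::getline(in, state);
    }
    return state;
}
```

For "John\n   New Hampshire\n", std::ws strips the indentation and yields "New Hampshire", while ignore() preserves it. For "John Smith\nNew Hampshire\n", std::ws stops at the leftover word and yields "Smith", while ignore() throws the leftover away and yields "New Hampshire".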


1: std::skipws is a manipulator that tells the input stream to discard leading whitespace when performing formatted input. This can be turned off with the std::noskipws manipulator.

2: Input streams deem certain characters as whitespace by default, such as the space character, newline character, form feed, carriage return, etc.

3: This is the signature of std::basic_istream<...>::ignore(). You can call it with zero arguments to discard a single character from the stream, one argument to discard a certain number of characters, or two arguments to discard count characters or until it reaches delim, whichever comes first. You normally use std::numeric_limits<std::streamsize>::max() as the value of count if you don't know how many characters there are before the delimiter, but you want to discard them anyway.
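The three ways of calling ignore() described in note 3 can be compared by looking at what is left in the stream afterwards. This is a sketch only; the helper names are invented and the stream is a std::istringstream.

```cpp
#include <ios>
#include <istream>
#include <iterator>
#include <sstream>
#include <string>

// Hypothetical helper: returns everything still unread in the stream.
std::string rest_of(std::istream& in)
{
    return std::string(std::istreambuf_iterator<char>(in),
                       std::istreambuf_iterator<char>());
}

// Zero arguments: discards a single character.
std::string after_ignore_one(const std::string& text)
{
    std::istringstream in(text);
    in.ignore();
    return rest_of(in);
}

// One argument: discards up to count characters.
std::string after_ignore_count(const std::string& text, std::streamsize count)
{
    std::istringstream in(text);
    in.ignore(count);
    return rest_of(in);
}

// Two arguments: discards up to count characters, or up to and including
// delim, whichever comes first.
std::string after_ignore_until(const std::string& text,
                               std::streamsize count, char delim)
{
    std::istringstream in(text);
    in.ignore(count, delim);
    return rest_of(in);
}
```

For the text "abc;def", the zero-argument form leaves "bc;def", a count of 2 leaves "c;def", and a count of 100 with ';' as the delimiter leaves "def".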

