Welcome to Vigges Developer Community - Open, Learning, Share
Welcome To Ask or Share your Answers For Others

c++ - std::remove_if GCC implementation isn't efficient?

From another question here, there seems to be evidence that GCC's implementation of std::remove_if is not as efficient as the following implementation:

'raw homebrew' solution:

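static char str1[100] = "str,, ing";
size_t size = sizeof(str1)/sizeof(str1[0]);  // 100: the whole array, not the string length

int bad = 0;  // write position: end of the characters kept so far
int cur = 0;  // read position
while (str1[cur] != '\0') {
    // Copy a kept character down only if something was removed before it.
    if (bad < cur && !ispunct(str1[cur]) && !isspace(str1[cur])) {
        str1[bad] = str1[cur];
    }
    if (ispunct(str1[cur]) || isspace(str1[cur])) {
        cur++;
    } else {
        cur++;
        bad++;
    }
}
str1[bad] = '\0';  // re-terminate the shortened string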

Timing outputs:

0.106860

Sample benchmarking code for std::remove_if for a solution of the same problem:

bool is_char_category_in_question(const char& c) {
    return std::ispunct(c) || std::isspace(c);
}

std::remove_if(&str1[0], &str1[size-1], is_char_category_in_question);

Timing outputs:

1.986838

Please see the ideone links above for the full code and the actual runtime results (giving the full code here would obscure the question!).

The execution times from the samples seem to confirm that the first implementation performs much better.

Can anyone explain why the std::remove_if() algorithm doesn't (or can't) provide a similarly efficient solution for the given problem?


1 Answer


It looks to me as though you're running remove_if over essentially the whole 100-character array, since size is 100, whereas the "homebrew" loop stops as soon as it finds the nul terminator (which is only about 10 characters in).
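To make the range mismatch concrete, here is a minimal sketch of calling remove_if over only the characters before the terminator, so that it does the same amount of work as the hand-written loop. The names is_punct_or_space and strip_in_place are made up for this illustration and are not from the question's benchmark; the cast to unsigned char is an extra precaution I've added.

```cpp
#include <algorithm>
#include <cctype>
#include <cstddef>
#include <cstring>

// Hypothetical predicate: true for characters that should be dropped.
// The cast avoids undefined behaviour in <cctype> functions when char
// is signed and holds a negative value.
inline bool is_punct_or_space(char c) {
    const unsigned char uc = static_cast<unsigned char>(c);
    return std::ispunct(uc) || std::isspace(uc);
}

// Hypothetical helper: strips punctuation and whitespace in place from a
// nul-terminated buffer and returns the new string length.
inline std::size_t strip_in_place(char* s) {
    char* last = s + std::strlen(s);  // stop at the terminator, not the array end
    char* new_end = std::remove_if(s, last, is_punct_or_space);
    *new_end = '\0';                  // remove_if only shifts; re-terminate ourselves
    return static_cast<std::size_t>(new_end - s);
}
```

With the question's buffer, strip_in_place(str1) visits 9 characters instead of roughly 100, which is the fair comparison against the while loop.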

After dealing with that using the change in your comment below, on GCC with -O2 I still see a difference of about a factor of 2, with remove_if being slower. Changing the predicate to:

struct is_char_category_in_question {
    bool operator()(const char& c) const {
        return std::ispunct(c) || std::isspace(c);
    }
};

gets rid of almost all of this difference, although there may still be a <10% difference. So that looks to me like a quality-of-implementation issue, namely whether or not the predicate gets inlined, although I haven't checked the assembly to confirm.
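For anyone who wants to compare the predicate styles themselves, here is a rough sketch, not the benchmark code from the question. The usual explanation is that a free function passed by name decays to a function pointer, so the call target is not part of the template argument's type, while a function object or lambda has a unique type whose operator() the compiler can see directly; whether a given compiler actually inlines either one is something you'd have to check in the assembly. The names strip_with, is_punct_or_space_fn and IsPunctOrSpace are hypothetical.

```cpp
#include <algorithm>
#include <cctype>
#include <cstddef>
#include <cstring>

// Free function: passed by name it becomes a bool(*)(char) function pointer.
inline bool is_punct_or_space_fn(char c) {
    const unsigned char uc = static_cast<unsigned char>(c);
    return std::ispunct(uc) || std::isspace(uc);
}

// Function object: the call target is fixed by the type itself.
struct IsPunctOrSpace {
    bool operator()(char c) const {
        const unsigned char uc = static_cast<unsigned char>(c);
        return std::ispunct(uc) || std::isspace(uc);
    }
};

// Hypothetical helper: runs remove_if over the nul-terminated string with
// whatever predicate type it is given and returns the new length.
template <typename Pred>
std::size_t strip_with(char* s, Pred pred) {
    char* new_end = std::remove_if(s, s + std::strlen(s), pred);
    *new_end = '\0';
    return static_cast<std::size_t>(new_end - s);
}
```

All three call styles, strip_with(buf, is_punct_or_space_fn), strip_with(buf, IsPunctOrSpace()) and strip_with(buf, [](char c) { return IsPunctOrSpace()(c); }), give the same result; only the type the template is instantiated with differs, and that is what can affect inlining.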

Since your test harness means that no characters are actually removed after the first pass, I'm not troubled by a 10% difference. I'm a bit surprised, but not enough to really get into it. YMMV :-)

