What are the performance impacts of 'functional' Rust? Announcing the arrival of Valued Associate #679: Cesar Manara Planned maintenance scheduled April 23, 2019 at 00:00UTC (8:00pm US/Eastern) Data science time! April 2019 and salary with experience The Ask Question Wizard is Live!How can I add new methods to Iterator?How can I append a formatted string to an existing String?What is tail recursion?What is 'Currying'?What is a monad?What is the difference between a 'closure' and a 'lambda'?Does functional programming replace GoF design patterns?What is (functional) reactive programming?What is the difference between declarative and imperative programming?Functional programming vs Object Oriented programming“What part of Hindley-Milner do you not understand?”map function for objects (instead of arrays)

Selecting user stories during sprint planning

Maximum summed subsequences with non-adjacent items

Do I really need to have a message in a novel to appeal to readers?

I want to complete my figure

Is there any word for a place full of confusion?

Hangman Game with C++

Do wooden building fires get hotter than 600°C?

What order were files/directories outputted in dir?

Denied boarding although I have proper visa and documentation. To whom should I make a complaint?

I am having problem understanding the behavior of below code in JavaScript

What would you call this weird metallic apparatus that allows you to lift people?

Did any compiler fully use 80-bit floating point?

Should I use a zero-interest credit card for a large one-time purchase?

Disembodied hand growing fangs

How do I find out the mythology and history of my Fortress?

AppleTVs create a chatty alternate WiFi network

How could we fake a moon landing now?

MLE of the unknown radius

"Lost his faith in humanity in the trenches of Verdun" — last line of an SF story

Why does the remaining Rebel fleet at the end of Rogue One seem dramatically larger than the one in A New Hope?

Why do early math courses focus on the cross sections of a cone and not on other 3D objects?

Project Euler #1 in C++

Can a new player join a group only when a new campaign starts?

Why do we bend a book to keep it straight?

What are the performance impacts of 'functional' Rust?

Announcing the arrival of Valued Associate #679: Cesar Manara

Planned maintenance scheduled April 23, 2019 at 00:00UTC (8:00pm US/Eastern)

Data science time! April 2019 and salary with experience

The Ask Question Wizard is Live!How can I add new methods to Iterator?How can I append a formatted string to an existing String?What is tail recursion?What is 'Currying'?What is a monad?What is the difference between a 'closure' and a 'lambda'?Does functional programming replace GoF design patterns?What is (functional) reactive programming?What is the difference between declarative and imperative programming?Functional programming vs Object Oriented programming“What part of Hindley-Milner do you not understand?”map function for objects (instead of arrays)

.everyoneloves__top-leaderboard:empty,.everyoneloves__mid-leaderboard:empty,.everyoneloves__bot-mid-leaderboard:empty height:90px;width:728px;box-sizing:border-box;

I am following the Rust track on Exercism.io. I have a fair amount of C/C++ experience. I like the 'functional' elements of Rust but I'm concerned about the relative performance.

I solved the 'run length encoding' problem:

pub fn encode(source: &str) -> String 
 let mut retval = String::new();
 let firstchar = source.chars().next();
 let mut currentchar = match firstchar 
 Some(x) => x,
 None => return retval,
 ;
 let mut currentcharcount: u32 = 0;
 for c in source.chars() 
 if c == currentchar 
 currentcharcount += 1;
 else 
 if currentcharcount > 1 
 retval.push_str(&currentcharcount.to_string());
 
 retval.push(currentchar);
 currentchar = c;
 currentcharcount = 1;
 
 
 if currentcharcount > 1 
 retval.push_str(&currentcharcount.to_string());
 
 retval.push(currentchar);
 retval

I noticed that one of the top-rated answers looked more like this:

extern crate itertools;

use itertools::Itertools;

pub fn encode(data: &str) -> String (c, group)

I love the top rated solution; it is simple, functional, and elegant. This is what they promised me Rust would be all about. Mine on the other hand is gross and full of mutable variables. You can tell I'm used to C++.

My problem is that the functional style has a SIGNIFICANT performance impact. I tested both versions with the same 4MB of random data encoded 1000 times. My imperative solution took under 10 seconds; the functional solution was ~2mins30seconds.

Why is the functional style so much slower than the imperative style?

Is there some problem with the functional implementation which is causing such a huge slowdown?

If I want to write high performance code, should I ever use this functional style?

edited Apr 14 at 13:28

Shepmaster

163k16337483

asked Apr 14 at 12:07

David Copernicus Bowie

15816

New contributor

1

The difference looks extremely surprising to me; that's a factor of x15! Have you checked that both implementations yield the same result?

– Matthieu M.
Apr 14 at 12:33

1

@MatthieuM. yep, or at least both functions pass all unit tests defined by exercism.

– David Copernicus Bowie
Apr 14 at 13:17

2

I am thinking that there should be a way to replace the map step with a flat_map step, with a special-purpose iterator implementation taking the character and count and outputting the required stream of bytes. Forward encoding the integer is a bit tricky, but not too bad with count_leading_zeroes giving a hint of the magnitude (clz(i) * 77 / 256 gives the log 10).

– Matthieu M.
Apr 14 at 13:33

add a comment |

I am following the Rust track on Exercism.io. I have a fair amount of C/C++ experience. I like the 'functional' elements of Rust but I'm concerned about the relative performance.

I solved the 'run length encoding' problem:

pub fn encode(source: &str) -> String 
 let mut retval = String::new();
 let firstchar = source.chars().next();
 let mut currentchar = match firstchar 
 Some(x) => x,
 None => return retval,
 ;
 let mut currentcharcount: u32 = 0;
 for c in source.chars() 
 if c == currentchar 
 currentcharcount += 1;
 else 
 if currentcharcount > 1 
 retval.push_str(&currentcharcount.to_string());
 
 retval.push(currentchar);
 currentchar = c;
 currentcharcount = 1;
 
 
 if currentcharcount > 1 
 retval.push_str(&currentcharcount.to_string());
 
 retval.push(currentchar);
 retval

I noticed that one of the top-rated answers looked more like this:

extern crate itertools;

use itertools::Itertools;

pub fn encode(data: &str) -> String (c, group)

Why is the functional style so much slower than the imperative style?

Is there some problem with the functional implementation which is causing such a huge slowdown?

If I want to write high performance code, should I ever use this functional style?

edited Apr 14 at 13:28

Shepmaster

163k16337483

asked Apr 14 at 12:07

David Copernicus Bowie

15816

New contributor

1

The difference looks extremely surprising to me; that's a factor of x15! Have you checked that both implementations yield the same result?

– Matthieu M.
Apr 14 at 12:33

1

@MatthieuM. yep, or at least both functions pass all unit tests defined by exercism.

– David Copernicus Bowie
Apr 14 at 13:17

2

I am thinking that there should be a way to replace the map step with a flat_map step, with a special-purpose iterator implementation taking the character and count and outputting the required stream of bytes. Forward encoding the integer is a bit tricky, but not too bad with count_leading_zeroes giving a hint of the magnitude (clz(i) * 77 / 256 gives the log 10).

– Matthieu M.
Apr 14 at 13:33

add a comment |

I am following the Rust track on Exercism.io. I have a fair amount of C/C++ experience. I like the 'functional' elements of Rust but I'm concerned about the relative performance.

I solved the 'run length encoding' problem:

pub fn encode(source: &str) -> String 
 let mut retval = String::new();
 let firstchar = source.chars().next();
 let mut currentchar = match firstchar 
 Some(x) => x,
 None => return retval,
 ;
 let mut currentcharcount: u32 = 0;
 for c in source.chars() 
 if c == currentchar 
 currentcharcount += 1;
 else 
 if currentcharcount > 1 
 retval.push_str(&currentcharcount.to_string());
 
 retval.push(currentchar);
 currentchar = c;
 currentcharcount = 1;
 
 
 if currentcharcount > 1 
 retval.push_str(&currentcharcount.to_string());
 
 retval.push(currentchar);
 retval

I noticed that one of the top-rated answers looked more like this:

extern crate itertools;

use itertools::Itertools;

pub fn encode(data: &str) -> String (c, group)

Why is the functional style so much slower than the imperative style?

Is there some problem with the functional implementation which is causing such a huge slowdown?

If I want to write high performance code, should I ever use this functional style?

edited Apr 14 at 13:28

Shepmaster

163k16337483

asked Apr 14 at 12:07

David Copernicus Bowie

15816

New contributor

I am following the Rust track on Exercism.io. I have a fair amount of C/C++ experience. I like the 'functional' elements of Rust but I'm concerned about the relative performance.

I solved the 'run length encoding' problem:

pub fn encode(source: &str) -> String 
 let mut retval = String::new();
 let firstchar = source.chars().next();
 let mut currentchar = match firstchar 
 Some(x) => x,
 None => return retval,
 ;
 let mut currentcharcount: u32 = 0;
 for c in source.chars() 
 if c == currentchar 
 currentcharcount += 1;
 else 
 if currentcharcount > 1 
 retval.push_str(&currentcharcount.to_string());
 
 retval.push(currentchar);
 currentchar = c;
 currentcharcount = 1;
 
 
 if currentcharcount > 1 
 retval.push_str(&currentcharcount.to_string());
 
 retval.push(currentchar);
 retval

I noticed that one of the top-rated answers looked more like this:

extern crate itertools;

use itertools::Itertools;

pub fn encode(data: &str) -> String (c, group)

Why is the functional style so much slower than the imperative style?

Is there some problem with the functional implementation which is causing such a huge slowdown?

If I want to write high performance code, should I ever use this functional style?

functional-programming rust imperative-programming

edited Apr 14 at 13:28

Shepmaster

163k16337483

asked Apr 14 at 12:07

David Copernicus Bowie

15816

New contributor

edited Apr 14 at 13:28

Shepmaster

163k16337483

asked Apr 14 at 12:07

David Copernicus Bowie

15816

New contributor

edited Apr 14 at 13:28

Shepmaster

163k16337483

edited Apr 14 at 13:28

Shepmaster

163k16337483

edited Apr 14 at 13:28

Shepmaster

163k16337483

asked Apr 14 at 12:07

David Copernicus Bowie

15816

New contributor

asked Apr 14 at 12:07

David Copernicus Bowie

15816

asked Apr 14 at 12:07

David Copernicus Bowie

15816

New contributor

David Copernicus Bowie is a new contributor to this site. Take care in asking for clarification, commenting, and answering.
Check out our Code of Conduct.

1

The difference looks extremely surprising to me; that's a factor of x15! Have you checked that both implementations yield the same result?

– Matthieu M.
Apr 14 at 12:33

1

@MatthieuM. yep, or at least both functions pass all unit tests defined by exercism.

– David Copernicus Bowie
Apr 14 at 13:17

2

I am thinking that there should be a way to replace the map step with a flat_map step, with a special-purpose iterator implementation taking the character and count and outputting the required stream of bytes. Forward encoding the integer is a bit tricky, but not too bad with count_leading_zeroes giving a hint of the magnitude (clz(i) * 77 / 256 gives the log 10).

– Matthieu M.
Apr 14 at 13:33

add a comment |

1

The difference looks extremely surprising to me; that's a factor of x15! Have you checked that both implementations yield the same result?

– Matthieu M.
Apr 14 at 12:33

1

@MatthieuM. yep, or at least both functions pass all unit tests defined by exercism.

– David Copernicus Bowie
Apr 14 at 13:17

2

I am thinking that there should be a way to replace the map step with a flat_map step, with a special-purpose iterator implementation taking the character and count and outputting the required stream of bytes. Forward encoding the integer is a bit tricky, but not too bad with count_leading_zeroes giving a hint of the magnitude (clz(i) * 77 / 256 gives the log 10).

– Matthieu M.
Apr 14 at 13:33

The difference looks extremely surprising to me; that's a factor of x15! Have you checked that both implementations yield the same result?

– Matthieu M.
Apr 14 at 12:33

@MatthieuM. yep, or at least both functions pass all unit tests defined by exercism.

– David Copernicus Bowie
Apr 14 at 13:17

I am thinking that there should be a way to replace the map step with a flat_map step, with a special-purpose iterator implementation taking the character and count and outputting the required stream of bytes. Forward encoding the integer is a bit tricky, but not too bad with count_leading_zeroes giving a hint of the magnitude (clz(i) * 77 / 256 gives the log 10).

– Matthieu M.
Apr 14 at 13:33

add a comment |

2 Answers
2

active

oldest

votes

TL;DR

A functional implementation can be faster than your original procedural implementation, in certain cases.

Why is the functional style so much slower than the imperative style? Is there some problem with the functional implementation which is causing such a huge slowdown?

As Matthieu M. already pointed out, the important thing to note is that the algorithm matters. How that algorithm is expressed (procedural, imperative, object-oriented, functional, declarative) generally doesn't matter.

I see two main issues with the functional code:

Allocating numerous strings over and over is inefficient. In the original functional implementation, this is done via to_string and format!.

There's the overhead of using group_by, which exists to give a nested iterator, which you don't need just to get the counts.

Using more of itertools (batching, take_while_ref, format_with) brings the two implementations much closer:

pub fn encode_slim(data: &str) -> String match count 
 1 => f(&c),
 n => f(&format_args!("", n, c)),
 )
 .to_string()

A benchmark of 4MiB of random alphanumeric data, compiled with RUSTFLAGS='-C target-cpu=native':

encode (procedural) time: [21.082 ms 21.620 ms 22.211 ms]

encode (fast) time: [26.457 ms 27.104 ms 27.882 ms]
Found 7 outliers among 100 measurements (7.00%)
 4 (4.00%) high mild
 3 (3.00%) high severe

If you are interested in creating your own iterator, you can mix-and-match the procedural code with more functional code:

struct RunLength<I> 
 iter: I,
 saved: Option<char>,


impl<I> RunLength<I>
where
 I: Iterator<Item = char>,

 fn new(mut iter: I) -> Self 
 let saved = iter.next(); // See footnote 1
 Self iter, saved 
 


impl<I> Iterator for RunLength<I>
where
 I: Iterator<Item = char>,

 type Item = (char, usize);

 fn next(&mut self) -> Option<Self::Item> self.iter.next())?;

 let mut count = 1;
 while let Some(n) = self.iter.next() 
 if n == c 
 count += 1
 else 
 self.saved = Some(n);
 break;
 
 

 Some((c, count))
 


pub fn encode_tiny(data: &str) -> String 
 use std::fmt::Write;

 RunLength::new(data.chars()).fold(String::new(),

1 — thanks to Stargateur for pointing out that eagerly getting the first value helps branch prediction.

A benchmark of 4MiB of random alphanumeric data, compiled with RUSTFLAGS='-C target-cpu=native':

encode (procedural) time: [19.888 ms 20.301 ms 20.794 ms]
Found 4 outliers among 100 measurements (4.00%)
 3 (3.00%) high mild
 1 (1.00%) high severe

encode (tiny) time: [19.150 ms 19.262 ms 19.399 ms]
Found 11 outliers among 100 measurements (11.00%)
 5 (5.00%) high mild
 6 (6.00%) high severe

I believe this more clearly shows the main fundamental difference between the two implementations: an iterator-based solution is resumable. Every time we call next, we need to see if there was a previous character that we've read (self.saved). This adds a branch to the code that isn't there in the procedural code.

On the flip side, the iterator-based solution is more flexible — we can now compose all sorts of transformations on the data, or write directly to a file instead of a String, etc. The custom iterator can be extended to operate on a generic type instead of char as well, making it very flexible.

See also:

How can I add new methods to Iterator?

If I want to write high performance code, should I ever use this functional style?

I would, until benchmarking shows that it's the bottleneck. Then evaluate why it's the bottleneck.

Supporting code

Always got to show your work, right?

benchmark.rs

use criterion::criterion_group, criterion_main, Criterion; // 0.2.11
use rle::*;

fn criterion_benchmark(c: &mut Criterion) 
 let data = rand_data(4 * 1024 * 1024);

 c.bench_function("encode (procedural)", );

 c.bench_function("encode (functional)", );

 c.bench_function("encode (fast)", encode_slim(&data))
 );

 c.bench_function("encode (tiny)", 
 let data = data.clone();
 move );


criterion_group!(benches, criterion_benchmark);
criterion_main!(benches);

lib.rs

use itertools::Itertools; // 0.8.0
use rand; // 0.6.5

pub fn rand_data(len: usize) -> String 
 use rand::distributions::Alphanumeric, Distribution;
 let mut rng = rand::thread_rng();
 Alphanumeric.sample_iter(&mut rng).take(len).collect()


pub fn encode_proc(source: &str) -> String 
 let mut retval = String::new();
 let firstchar = source.chars().next();
 let mut currentchar = match firstchar 
 Some(x) => x,
 None => return retval,
 ;
 let mut currentcharcount: u32 = 0;
 for c in source.chars() 
 if c == currentchar 
 currentcharcount += 1;
 else 
 if currentcharcount > 1 
 retval.push_str(&currentcharcount.to_string());
 
 retval.push(currentchar);
 currentchar = c;
 currentcharcount = 1;
 
 
 if currentcharcount > 1 
 retval.push_str(&currentcharcount.to_string());
 
 retval.push(currentchar);
 retval


pub fn encode_iter(data: &str) -> String (c, group)

pub fn encode_slim(data: &str) -> String match count 
 1 => f(&c),
 n => f(&format_args!("", n, c)),
 )
 .to_string()


struct RunLength<I> 
 iter: I,
 saved: Option<char>,


impl<I> RunLength<I>
where
 I: Iterator<Item = char>,

 fn new(mut iter: I) -> Self 
 let saved = iter.next();
 Self iter, saved 
 


impl<I> Iterator for RunLength<I>
where
 I: Iterator<Item = char>,

 type Item = (char, usize);

 fn next(&mut self) -> Option<Self::Item> self.iter.next())?;

 let mut count = 1;
 while let Some(n) = self.iter.next() 
 if n == c 
 count += 1
 else 
 self.saved = Some(n);
 break;
 
 

 Some((c, count))
 


pub fn encode_tiny(data: &str) -> String 
 use std::fmt::Write;

 RunLength::new(data.chars()).fold(String::new(), 

#[cfg(test)]
mod test 
 use super::*;

 #[test]
 fn all_the_same() 
 let data = rand_data(1024);

 let a = encode_proc(&data);
 let b = encode_iter(&data);
 let c = encode_slim(&data);
 let d = encode_tiny(&data);

 assert_eq!(a, b);
 assert_eq!(a, c);
 assert_eq!(a, d);

edited Apr 15 at 1:06

answered Apr 14 at 14:49

Shepmaster

163k16337483

1

Great iterator!

– Matthieu M.
Apr 14 at 17:39

1

By the way, when resuming an iterator introduces a branch, it's possible to implement try_fold directly rather than relying on the default implementation (which calls next). This helps when the optimizer fails to optimize out the branch.

– Matthieu M.
Apr 15 at 8:16

add a comment |

Let's review the functional implementation!

Memory Allocations

One of the big issues of the functional style proposed here is the closure passed to the map method which allocates a lot. Every single character is first mapped to a String before being collected.

It also uses the format machinery, which is known to be relatively slow.

Sometimes, people try way too hard to get a "pure" functional solution, instead:

let mut result = String::new();
for (c, group) in &source.chars().group_by(|&c| c) 
 let count = group.count();
 if count > 1 
 result.push_str(&count.to_string());
 

 result.push(c);

is about as verbose, yet only allocates when count > 1 just like your solution does and does not use the format machinery either.

I would expect a significant performance win compared to the full functional solution, while at the same time still leveraging group_by for extra readability compared to the full imperative solution. Sometimes, you ought to mix and match!

edited Apr 15 at 1:05

Shepmaster

163k16337483

answered Apr 14 at 12:42

Matthieu M.

207k29285526

That certainly gives us a speed boost, but it is still around 3x slower than the imperative version (30s rather than 10s in my tests). In fact, even if I only push a constant letter in that for loop it is still about 14s, so around 50% slower than the imperative version. That leads me to believe that group_by is probably not zero cost for this use case. Answer accepted anyway!

– David Copernicus Bowie
Apr 14 at 13:55

1

How can I append a formatted string to an existing String?

– Shepmaster
Apr 14 at 14:57

add a comment |

Your Answer

StackExchange.ifUsing("editor", function ()
StackExchange.using("externalEditor", function ()
StackExchange.using("snippets", function ()
StackExchange.snippets.init();
);
);
, "code-snippets");

StackExchange.ready(function()
var channelOptions =
tags: "".split(" "),
id: "1"
;
initTagRenderer("".split(" "), "".split(" "), channelOptions);

StackExchange.using("externalEditor", function()
// Have to fire editor after snippets, if snippets enabled
if (StackExchange.settings.snippets.snippetsEnabled)
StackExchange.using("snippets", function()
createEditor();
);

else
createEditor();

);

function createEditor()
StackExchange.prepareEditor(
heartbeatType: 'answer',
autoActivateHeartbeat: false,
convertImagesToLinks: true,
noModals: true,
showLowRepImageUploadWarning: true,
reputationToPostImages: 10,
bindNavPrevention: true,
postfix: "",
imageUploader:
brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
allowUrls: true
,
onDemand: true,
discardSelector: ".discard-answer"
,immediatelyShowMarkdownHelp:true
);

);

David Copernicus Bowie is a new contributor. Be nice, and check out our Code of Conduct.

draft saved

draft discarded

StackExchange.ready(
function ()
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f55675093%2fwhat-are-the-performance-impacts-of-functional-rust%23new-answer', 'question_page');

);

Post as a guest

Name

Required, but never shown

2 Answers
2

active

oldest

votes

2 Answers
2

active

oldest

votes

TL;DR

A functional implementation can be faster than your original procedural implementation, in certain cases.

Why is the functional style so much slower than the imperative style? Is there some problem with the functional implementation which is causing such a huge slowdown?

I see two main issues with the functional code:

Allocating numerous strings over and over is inefficient. In the original functional implementation, this is done via to_string and format!.

There's the overhead of using group_by, which exists to give a nested iterator, which you don't need just to get the counts.

Using more of itertools (batching, take_while_ref, format_with) brings the two implementations much closer:

pub fn encode_slim(data: &str) -> String match count 
 1 => f(&c),
 n => f(&format_args!("", n, c)),
 )
 .to_string()

A benchmark of 4MiB of random alphanumeric data, compiled with RUSTFLAGS='-C target-cpu=native':

encode (procedural) time: [21.082 ms 21.620 ms 22.211 ms]

encode (fast) time: [26.457 ms 27.104 ms 27.882 ms]
Found 7 outliers among 100 measurements (7.00%)
 4 (4.00%) high mild
 3 (3.00%) high severe

If you are interested in creating your own iterator, you can mix-and-match the procedural code with more functional code:

struct RunLength<I> 
 iter: I,
 saved: Option<char>,


impl<I> RunLength<I>
where
 I: Iterator<Item = char>,

 fn new(mut iter: I) -> Self 
 let saved = iter.next(); // See footnote 1
 Self iter, saved 
 


impl<I> Iterator for RunLength<I>
where
 I: Iterator<Item = char>,

 type Item = (char, usize);

 fn next(&mut self) -> Option<Self::Item> self.iter.next())?;

 let mut count = 1;
 while let Some(n) = self.iter.next() 
 if n == c 
 count += 1
 else 
 self.saved = Some(n);
 break;
 
 

 Some((c, count))
 


pub fn encode_tiny(data: &str) -> String 
 use std::fmt::Write;

 RunLength::new(data.chars()).fold(String::new(),

1 — thanks to Stargateur for pointing out that eagerly getting the first value helps branch prediction.

A benchmark of 4MiB of random alphanumeric data, compiled with RUSTFLAGS='-C target-cpu=native':

encode (procedural) time: [19.888 ms 20.301 ms 20.794 ms]
Found 4 outliers among 100 measurements (4.00%)
 3 (3.00%) high mild
 1 (1.00%) high severe

encode (tiny) time: [19.150 ms 19.262 ms 19.399 ms]
Found 11 outliers among 100 measurements (11.00%)
 5 (5.00%) high mild
 6 (6.00%) high severe

See also:

How can I add new methods to Iterator?

If I want to write high performance code, should I ever use this functional style?

I would, until benchmarking shows that it's the bottleneck. Then evaluate why it's the bottleneck.

Supporting code

Always got to show your work, right?

benchmark.rs

use criterion::criterion_group, criterion_main, Criterion; // 0.2.11
use rle::*;

fn criterion_benchmark(c: &mut Criterion) 
 let data = rand_data(4 * 1024 * 1024);

 c.bench_function("encode (procedural)", );

 c.bench_function("encode (functional)", );

 c.bench_function("encode (fast)", encode_slim(&data))
 );

 c.bench_function("encode (tiny)", 
 let data = data.clone();
 move );


criterion_group!(benches, criterion_benchmark);
criterion_main!(benches);

lib.rs

use itertools::Itertools; // 0.8.0
use rand; // 0.6.5

pub fn rand_data(len: usize) -> String 
 use rand::distributions::Alphanumeric, Distribution;
 let mut rng = rand::thread_rng();
 Alphanumeric.sample_iter(&mut rng).take(len).collect()


pub fn encode_proc(source: &str) -> String 
 let mut retval = String::new();
 let firstchar = source.chars().next();
 let mut currentchar = match firstchar 
 Some(x) => x,
 None => return retval,
 ;
 let mut currentcharcount: u32 = 0;
 for c in source.chars() 
 if c == currentchar 
 currentcharcount += 1;
 else 
 if currentcharcount > 1 
 retval.push_str(&currentcharcount.to_string());
 
 retval.push(currentchar);
 currentchar = c;
 currentcharcount = 1;
 
 
 if currentcharcount > 1 
 retval.push_str(&currentcharcount.to_string());
 
 retval.push(currentchar);
 retval


pub fn encode_iter(data: &str) -> String (c, group)

pub fn encode_slim(data: &str) -> String match count 
 1 => f(&c),
 n => f(&format_args!("", n, c)),
 )
 .to_string()


struct RunLength<I> 
 iter: I,
 saved: Option<char>,


impl<I> RunLength<I>
where
 I: Iterator<Item = char>,

 fn new(mut iter: I) -> Self 
 let saved = iter.next();
 Self iter, saved 
 


impl<I> Iterator for RunLength<I>
where
 I: Iterator<Item = char>,

 type Item = (char, usize);

 fn next(&mut self) -> Option<Self::Item> self.iter.next())?;

 let mut count = 1;
 while let Some(n) = self.iter.next() 
 if n == c 
 count += 1
 else 
 self.saved = Some(n);
 break;
 
 

 Some((c, count))
 


pub fn encode_tiny(data: &str) -> String 
 use std::fmt::Write;

 RunLength::new(data.chars()).fold(String::new(), 

#[cfg(test)]
mod test 
 use super::*;

 #[test]
 fn all_the_same() 
 let data = rand_data(1024);

 let a = encode_proc(&data);
 let b = encode_iter(&data);
 let c = encode_slim(&data);
 let d = encode_tiny(&data);

 assert_eq!(a, b);
 assert_eq!(a, c);
 assert_eq!(a, d);

edited Apr 15 at 1:06

answered Apr 14 at 14:49

Shepmaster

163k16337483

1

Great iterator!

– Matthieu M.
Apr 14 at 17:39

1

By the way, when resuming an iterator introduces a branch, it's possible to implement try_fold directly rather than relying on the default implementation (which calls next). This helps when the optimizer fails to optimize out the branch.

– Matthieu M.
Apr 15 at 8:16

add a comment |

TL;DR

A functional implementation can be faster than your original procedural implementation, in certain cases.

Why is the functional style so much slower than the imperative style? Is there some problem with the functional implementation which is causing such a huge slowdown?

I see two main issues with the functional code:

Allocating numerous strings over and over is inefficient. In the original functional implementation, this is done via to_string and format!.

There's the overhead of using group_by, which exists to give a nested iterator, which you don't need just to get the counts.

Using more of itertools (batching, take_while_ref, format_with) brings the two implementations much closer:

pub fn encode_slim(data: &str) -> String match count 
 1 => f(&c),
 n => f(&format_args!("", n, c)),
 )
 .to_string()

A benchmark of 4MiB of random alphanumeric data, compiled with RUSTFLAGS='-C target-cpu=native':

encode (procedural) time: [21.082 ms 21.620 ms 22.211 ms]

encode (fast) time: [26.457 ms 27.104 ms 27.882 ms]
Found 7 outliers among 100 measurements (7.00%)
 4 (4.00%) high mild
 3 (3.00%) high severe

If you are interested in creating your own iterator, you can mix-and-match the procedural code with more functional code:

struct RunLength<I> 
 iter: I,
 saved: Option<char>,


impl<I> RunLength<I>
where
 I: Iterator<Item = char>,

 fn new(mut iter: I) -> Self 
 let saved = iter.next(); // See footnote 1
 Self iter, saved 
 


impl<I> Iterator for RunLength<I>
where
 I: Iterator<Item = char>,

 type Item = (char, usize);

 fn next(&mut self) -> Option<Self::Item> self.iter.next())?;

 let mut count = 1;
 while let Some(n) = self.iter.next() 
 if n == c 
 count += 1
 else 
 self.saved = Some(n);
 break;
 
 

 Some((c, count))
 


pub fn encode_tiny(data: &str) -> String 
 use std::fmt::Write;

 RunLength::new(data.chars()).fold(String::new(),

1 — thanks to Stargateur for pointing out that eagerly getting the first value helps branch prediction.

A benchmark of 4MiB of random alphanumeric data, compiled with RUSTFLAGS='-C target-cpu=native':

encode (procedural) time: [19.888 ms 20.301 ms 20.794 ms]
Found 4 outliers among 100 measurements (4.00%)
 3 (3.00%) high mild
 1 (1.00%) high severe

encode (tiny) time: [19.150 ms 19.262 ms 19.399 ms]
Found 11 outliers among 100 measurements (11.00%)
 5 (5.00%) high mild
 6 (6.00%) high severe

See also:

How can I add new methods to Iterator?

If I want to write high performance code, should I ever use this functional style?

I would, until benchmarking shows that it's the bottleneck. Then evaluate why it's the bottleneck.

Supporting code

Always got to show your work, right?

benchmark.rs

use criterion::criterion_group, criterion_main, Criterion; // 0.2.11
use rle::*;

fn criterion_benchmark(c: &mut Criterion) 
 let data = rand_data(4 * 1024 * 1024);

 c.bench_function("encode (procedural)", );

 c.bench_function("encode (functional)", );

 c.bench_function("encode (fast)", encode_slim(&data))
 );

 c.bench_function("encode (tiny)", 
 let data = data.clone();
 move );


criterion_group!(benches, criterion_benchmark);
criterion_main!(benches);

lib.rs

use itertools::Itertools; // 0.8.0
use rand; // 0.6.5

pub fn rand_data(len: usize) -> String 
 use rand::distributions::Alphanumeric, Distribution;
 let mut rng = rand::thread_rng();
 Alphanumeric.sample_iter(&mut rng).take(len).collect()


pub fn encode_proc(source: &str) -> String 
 let mut retval = String::new();
 let firstchar = source.chars().next();
 let mut currentchar = match firstchar 
 Some(x) => x,
 None => return retval,
 ;
 let mut currentcharcount: u32 = 0;
 for c in source.chars() 
 if c == currentchar 
 currentcharcount += 1;
 else 
 if currentcharcount > 1 
 retval.push_str(&currentcharcount.to_string());
 
 retval.push(currentchar);
 currentchar = c;
 currentcharcount = 1;
 
 
 if currentcharcount > 1 
 retval.push_str(&currentcharcount.to_string());
 
 retval.push(currentchar);
 retval


pub fn encode_iter(data: &str) -> String (c, group)

pub fn encode_slim(data: &str) -> String match count 
 1 => f(&c),
 n => f(&format_args!("", n, c)),
 )
 .to_string()


struct RunLength<I> 
 iter: I,
 saved: Option<char>,


impl<I> RunLength<I>
where
 I: Iterator<Item = char>,

 fn new(mut iter: I) -> Self 
 let saved = iter.next();
 Self iter, saved 
 


impl<I> Iterator for RunLength<I>
where
 I: Iterator<Item = char>,

 type Item = (char, usize);

 fn next(&mut self) -> Option<Self::Item> self.iter.next())?;

 let mut count = 1;
 while let Some(n) = self.iter.next() 
 if n == c 
 count += 1
 else 
 self.saved = Some(n);
 break;
 
 

 Some((c, count))
 


pub fn encode_tiny(data: &str) -> String 
 use std::fmt::Write;

 RunLength::new(data.chars()).fold(String::new(), 

#[cfg(test)]
mod test 
 use super::*;

 #[test]
 fn all_the_same() 
 let data = rand_data(1024);

 let a = encode_proc(&data);
 let b = encode_iter(&data);
 let c = encode_slim(&data);
 let d = encode_tiny(&data);

 assert_eq!(a, b);
 assert_eq!(a, c);
 assert_eq!(a, d);

edited Apr 15 at 1:06

answered Apr 14 at 14:49

Shepmaster

163k16337483

1

Great iterator!

– Matthieu M.
Apr 14 at 17:39

1

By the way, when resuming an iterator introduces a branch, it's possible to implement try_fold directly rather than relying on the default implementation (which calls next). This helps when the optimizer fails to optimize out the branch.

– Matthieu M.
Apr 15 at 8:16

add a comment |

TL;DR

A functional implementation can be faster than your original procedural implementation, in certain cases.

Why is the functional style so much slower than the imperative style? Is there some problem with the functional implementation which is causing such a huge slowdown?

I see two main issues with the functional code:

Allocating numerous strings over and over is inefficient. In the original functional implementation, this is done via to_string and format!.

There's the overhead of using group_by, which exists to give a nested iterator, which you don't need just to get the counts.

Using more of itertools (batching, take_while_ref, format_with) brings the two implementations much closer:

pub fn encode_slim(data: &str) -> String match count 
 1 => f(&c),
 n => f(&format_args!("", n, c)),
 )
 .to_string()

A benchmark of 4MiB of random alphanumeric data, compiled with RUSTFLAGS='-C target-cpu=native':

encode (procedural) time: [21.082 ms 21.620 ms 22.211 ms]

encode (fast) time: [26.457 ms 27.104 ms 27.882 ms]
Found 7 outliers among 100 measurements (7.00%)
 4 (4.00%) high mild
 3 (3.00%) high severe

If you are interested in creating your own iterator, you can mix-and-match the procedural code with more functional code:

struct RunLength<I> 
 iter: I,
 saved: Option<char>,


impl<I> RunLength<I>
where
 I: Iterator<Item = char>,

 fn new(mut iter: I) -> Self 
 let saved = iter.next(); // See footnote 1
 Self iter, saved 
 


impl<I> Iterator for RunLength<I>
where
 I: Iterator<Item = char>,

 type Item = (char, usize);

 fn next(&mut self) -> Option<Self::Item> self.iter.next())?;

 let mut count = 1;
 while let Some(n) = self.iter.next() 
 if n == c 
 count += 1
 else 
 self.saved = Some(n);
 break;
 
 

 Some((c, count))
 


pub fn encode_tiny(data: &str) -> String 
 use std::fmt::Write;

 RunLength::new(data.chars()).fold(String::new(),

1 — thanks to Stargateur for pointing out that eagerly getting the first value helps branch prediction.

A benchmark of 4MiB of random alphanumeric data, compiled with RUSTFLAGS='-C target-cpu=native':

encode (procedural) time: [19.888 ms 20.301 ms 20.794 ms]
Found 4 outliers among 100 measurements (4.00%)
 3 (3.00%) high mild
 1 (1.00%) high severe

encode (tiny) time: [19.150 ms 19.262 ms 19.399 ms]
Found 11 outliers among 100 measurements (11.00%)
 5 (5.00%) high mild
 6 (6.00%) high severe

See also:

How can I add new methods to Iterator?

If I want to write high performance code, should I ever use this functional style?

I would, until benchmarking shows that it's the bottleneck. Then evaluate why it's the bottleneck.

Supporting code

Always got to show your work, right?

benchmark.rs

use criterion::criterion_group, criterion_main, Criterion; // 0.2.11
use rle::*;

fn criterion_benchmark(c: &mut Criterion) 
 let data = rand_data(4 * 1024 * 1024);

 c.bench_function("encode (procedural)", );

 c.bench_function("encode (functional)", );

 c.bench_function("encode (fast)", encode_slim(&data))
 );

 c.bench_function("encode (tiny)", 
 let data = data.clone();
 move );


criterion_group!(benches, criterion_benchmark);
criterion_main!(benches);

lib.rs

use itertools::Itertools; // 0.8.0
use rand; // 0.6.5

pub fn rand_data(len: usize) -> String 
 use rand::distributions::Alphanumeric, Distribution;
 let mut rng = rand::thread_rng();
 Alphanumeric.sample_iter(&mut rng).take(len).collect()


pub fn encode_proc(source: &str) -> String 
 let mut retval = String::new();
 let firstchar = source.chars().next();
 let mut currentchar = match firstchar 
 Some(x) => x,
 None => return retval,
 ;
 let mut currentcharcount: u32 = 0;
 for c in source.chars() 
 if c == currentchar 
 currentcharcount += 1;
 else 
 if currentcharcount > 1 
 retval.push_str(&currentcharcount.to_string());
 
 retval.push(currentchar);
 currentchar = c;
 currentcharcount = 1;
 
 
 if currentcharcount > 1 
 retval.push_str(&currentcharcount.to_string());
 
 retval.push(currentchar);
 retval


pub fn encode_iter(data: &str) -> String (c, group)

pub fn encode_slim(data: &str) -> String match count 
 1 => f(&c),
 n => f(&format_args!("", n, c)),
 )
 .to_string()


struct RunLength<I> 
 iter: I,
 saved: Option<char>,


impl<I> RunLength<I>
where
 I: Iterator<Item = char>,

 fn new(mut iter: I) -> Self 
 let saved = iter.next();
 Self iter, saved 
 


impl<I> Iterator for RunLength<I>
where
 I: Iterator<Item = char>,

 type Item = (char, usize);

 fn next(&mut self) -> Option<Self::Item> self.iter.next())?;

 let mut count = 1;
 while let Some(n) = self.iter.next() 
 if n == c 
 count += 1
 else 
 self.saved = Some(n);
 break;
 
 

 Some((c, count))
 


pub fn encode_tiny(data: &str) -> String 
 use std::fmt::Write;

 RunLength::new(data.chars()).fold(String::new(), 

#[cfg(test)]
mod test 
 use super::*;

 #[test]
 fn all_the_same() 
 let data = rand_data(1024);

 let a = encode_proc(&data);
 let b = encode_iter(&data);
 let c = encode_slim(&data);
 let d = encode_tiny(&data);

 assert_eq!(a, b);
 assert_eq!(a, c);
 assert_eq!(a, d);

edited Apr 15 at 1:06

answered Apr 14 at 14:49

Shepmaster

163k16337483

TL;DR

A functional implementation can be faster than your original procedural implementation, in certain cases.

Why is the functional style so much slower than the imperative style? Is there some problem with the functional implementation which is causing such a huge slowdown?

I see two main issues with the functional code:

Allocating numerous strings over and over is inefficient. In the original functional implementation, this is done via to_string and format!.

There's the overhead of using group_by, which exists to give a nested iterator, which you don't need just to get the counts.

Using more of itertools (batching, take_while_ref, format_with) brings the two implementations much closer:

pub fn encode_slim(data: &str) -> String match count 
 1 => f(&c),
 n => f(&format_args!("", n, c)),
 )
 .to_string()

A benchmark of 4MiB of random alphanumeric data, compiled with RUSTFLAGS='-C target-cpu=native':

encode (procedural) time: [21.082 ms 21.620 ms 22.211 ms]

encode (fast) time: [26.457 ms 27.104 ms 27.882 ms]
Found 7 outliers among 100 measurements (7.00%)
 4 (4.00%) high mild
 3 (3.00%) high severe

If you are interested in creating your own iterator, you can mix-and-match the procedural code with more functional code:

struct RunLength<I> 
 iter: I,
 saved: Option<char>,


impl<I> RunLength<I>
where
 I: Iterator<Item = char>,

 fn new(mut iter: I) -> Self 
 let saved = iter.next(); // See footnote 1
 Self iter, saved 
 


impl<I> Iterator for RunLength<I>
where
 I: Iterator<Item = char>,

 type Item = (char, usize);

 fn next(&mut self) -> Option<Self::Item> self.iter.next())?;

 let mut count = 1;
 while let Some(n) = self.iter.next() 
 if n == c 
 count += 1
 else 
 self.saved = Some(n);
 break;
 
 

 Some((c, count))
 


pub fn encode_tiny(data: &str) -> String 
 use std::fmt::Write;

 RunLength::new(data.chars()).fold(String::new(),

1 — thanks to Stargateur for pointing out that eagerly getting the first value helps branch prediction.

A benchmark of 4MiB of random alphanumeric data, compiled with RUSTFLAGS='-C target-cpu=native':

encode (procedural) time: [19.888 ms 20.301 ms 20.794 ms]
Found 4 outliers among 100 measurements (4.00%)
 3 (3.00%) high mild
 1 (1.00%) high severe

encode (tiny) time: [19.150 ms 19.262 ms 19.399 ms]
Found 11 outliers among 100 measurements (11.00%)
 5 (5.00%) high mild
 6 (6.00%) high severe

See also:

How can I add new methods to Iterator?

If I want to write high performance code, should I ever use this functional style?

I would, until benchmarking shows that it's the bottleneck. Then evaluate why it's the bottleneck.

Supporting code

Always got to show your work, right?

benchmark.rs

use criterion::criterion_group, criterion_main, Criterion; // 0.2.11
use rle::*;

fn criterion_benchmark(c: &mut Criterion) 
 let data = rand_data(4 * 1024 * 1024);

 c.bench_function("encode (procedural)", );

 c.bench_function("encode (functional)", );

 c.bench_function("encode (fast)", encode_slim(&data))
 );

 c.bench_function("encode (tiny)", 
 let data = data.clone();
 move );


criterion_group!(benches, criterion_benchmark);
criterion_main!(benches);

lib.rs

use itertools::Itertools; // 0.8.0
use rand; // 0.6.5

pub fn rand_data(len: usize) -> String 
 use rand::distributions::Alphanumeric, Distribution;
 let mut rng = rand::thread_rng();
 Alphanumeric.sample_iter(&mut rng).take(len).collect()


pub fn encode_proc(source: &str) -> String 
 let mut retval = String::new();
 let firstchar = source.chars().next();
 let mut currentchar = match firstchar 
 Some(x) => x,
 None => return retval,
 ;
 let mut currentcharcount: u32 = 0;
 for c in source.chars() 
 if c == currentchar 
 currentcharcount += 1;
 else 
 if currentcharcount > 1 
 retval.push_str(&currentcharcount.to_string());
 
 retval.push(currentchar);
 currentchar = c;
 currentcharcount = 1;
 
 
 if currentcharcount > 1 
 retval.push_str(&currentcharcount.to_string());
 
 retval.push(currentchar);
 retval


pub fn encode_iter(data: &str) -> String (c, group)

pub fn encode_slim(data: &str) -> String match count 
 1 => f(&c),
 n => f(&format_args!("", n, c)),
 )
 .to_string()


struct RunLength<I> 
 iter: I,
 saved: Option<char>,


impl<I> RunLength<I>
where
 I: Iterator<Item = char>,

 fn new(mut iter: I) -> Self 
 let saved = iter.next();
 Self iter, saved 
 


impl<I> Iterator for RunLength<I>
where
 I: Iterator<Item = char>,

 type Item = (char, usize);

 fn next(&mut self) -> Option<Self::Item> self.iter.next())?;

 let mut count = 1;
 while let Some(n) = self.iter.next() 
 if n == c 
 count += 1
 else 
 self.saved = Some(n);
 break;
 
 

 Some((c, count))
 


pub fn encode_tiny(data: &str) -> String 
 use std::fmt::Write;

 RunLength::new(data.chars()).fold(String::new(), 

#[cfg(test)]
mod test 
 use super::*;

 #[test]
 fn all_the_same() 
 let data = rand_data(1024);

 let a = encode_proc(&data);
 let b = encode_iter(&data);
 let c = encode_slim(&data);
 let d = encode_tiny(&data);

 assert_eq!(a, b);
 assert_eq!(a, c);
 assert_eq!(a, d);

edited Apr 15 at 1:06

answered Apr 14 at 14:49

Shepmaster

163k16337483

edited Apr 15 at 1:06

answered Apr 14 at 14:49

Shepmaster

163k16337483

answered Apr 14 at 14:49

Shepmaster

163k16337483

answered Apr 14 at 14:49

Shepmaster

163k16337483

1

Great iterator!

– Matthieu M.
Apr 14 at 17:39

1

By the way, when resuming an iterator introduces a branch, it's possible to implement try_fold directly rather than relying on the default implementation (which calls next). This helps when the optimizer fails to optimize out the branch.

– Matthieu M.
Apr 15 at 8:16

add a comment |

1

Great iterator!

– Matthieu M.
Apr 14 at 17:39

1

By the way, when resuming an iterator introduces a branch, it's possible to implement try_fold directly rather than relying on the default implementation (which calls next). This helps when the optimizer fails to optimize out the branch.

– Matthieu M.
Apr 15 at 8:16

Great iterator!

– Matthieu M.
Apr 14 at 17:39

By the way, when resuming an iterator introduces a branch, it's possible to implement try_fold directly rather than relying on the default implementation (which calls next). This helps when the optimizer fails to optimize out the branch.

– Matthieu M.
Apr 15 at 8:16

add a comment |

Let's review the functional implementation!

Memory Allocations

It also uses the format machinery, which is known to be relatively slow.

Sometimes, people try way too hard to get a "pure" functional solution, instead:

let mut result = String::new();
for (c, group) in &source.chars().group_by(|&c| c) 
 let count = group.count();
 if count > 1 
 result.push_str(&count.to_string());
 

 result.push(c);

is about as verbose, yet only allocates when count > 1 just like your solution does and does not use the format machinery either.

edited Apr 15 at 1:05

Shepmaster

163k16337483

answered Apr 14 at 12:42

Matthieu M.

207k29285526

That certainly gives us a speed boost, but it is still around 3x slower than the imperative version (30s rather than 10s in my tests). In fact, even if I only push a constant letter in that for loop it is still about 14s, so around 50% slower than the imperative version. That leads me to believe that group_by is probably not zero cost for this use case. Answer accepted anyway!

– David Copernicus Bowie
Apr 14 at 13:55

1

How can I append a formatted string to an existing String?

– Shepmaster
Apr 14 at 14:57

add a comment |

Let's review the functional implementation!

Memory Allocations

It also uses the format machinery, which is known to be relatively slow.

Sometimes, people try way too hard to get a "pure" functional solution, instead:

let mut result = String::new();
for (c, group) in &source.chars().group_by(|&c| c) 
 let count = group.count();
 if count > 1 
 result.push_str(&count.to_string());
 

 result.push(c);

is about as verbose, yet only allocates when count > 1 just like your solution does and does not use the format machinery either.

edited Apr 15 at 1:05

Shepmaster

163k16337483

answered Apr 14 at 12:42

Matthieu M.

207k29285526

That certainly gives us a speed boost, but it is still around 3x slower than the imperative version (30s rather than 10s in my tests). In fact, even if I only push a constant letter in that for loop it is still about 14s, so around 50% slower than the imperative version. That leads me to believe that group_by is probably not zero cost for this use case. Answer accepted anyway!

– David Copernicus Bowie
Apr 14 at 13:55

1

How can I append a formatted string to an existing String?

– Shepmaster
Apr 14 at 14:57

add a comment |

Let's review the functional implementation!

Memory Allocations

It also uses the format machinery, which is known to be relatively slow.

Sometimes, people try way too hard to get a "pure" functional solution, instead:

let mut result = String::new();
for (c, group) in &source.chars().group_by(|&c| c) 
 let count = group.count();
 if count > 1 
 result.push_str(&count.to_string());
 

 result.push(c);

is about as verbose, yet only allocates when count > 1 just like your solution does and does not use the format machinery either.

edited Apr 15 at 1:05

Shepmaster

163k16337483

answered Apr 14 at 12:42

Matthieu M.

207k29285526

Let's review the functional implementation!

Memory Allocations

It also uses the format machinery, which is known to be relatively slow.

Sometimes, people try way too hard to get a "pure" functional solution, instead:

let mut result = String::new();
for (c, group) in &source.chars().group_by(|&c| c) 
 let count = group.count();
 if count > 1 
 result.push_str(&count.to_string());
 

 result.push(c);

is about as verbose, yet only allocates when count > 1 just like your solution does and does not use the format machinery either.

edited Apr 15 at 1:05

Shepmaster

163k16337483

answered Apr 14 at 12:42

Matthieu M.

207k29285526

edited Apr 15 at 1:05

Shepmaster

163k16337483

edited Apr 15 at 1:05

Shepmaster

163k16337483

edited Apr 15 at 1:05

Shepmaster

163k16337483

answered Apr 14 at 12:42

Matthieu M.

207k29285526

answered Apr 14 at 12:42

Matthieu M.

207k29285526

answered Apr 14 at 12:42

Matthieu M.

207k29285526

That certainly gives us a speed boost, but it is still around 3x slower than the imperative version (30s rather than 10s in my tests). In fact, even if I only push a constant letter in that for loop it is still about 14s, so around 50% slower than the imperative version. That leads me to believe that group_by is probably not zero cost for this use case. Answer accepted anyway!

– David Copernicus Bowie
Apr 14 at 13:55

1

How can I append a formatted string to an existing String?

– Shepmaster
Apr 14 at 14:57

add a comment |

That certainly gives us a speed boost, but it is still around 3x slower than the imperative version (30s rather than 10s in my tests). In fact, even if I only push a constant letter in that for loop it is still about 14s, so around 50% slower than the imperative version. That leads me to believe that group_by is probably not zero cost for this use case. Answer accepted anyway!

– David Copernicus Bowie
Apr 14 at 13:55

1

How can I append a formatted string to an existing String?

– Shepmaster
Apr 14 at 14:57

That certainly gives us a speed boost, but it is still around 3x slower than the imperative version (30s rather than 10s in my tests). In fact, even if I only push a constant letter in that for loop it is still about 14s, so around 50% slower than the imperative version. That leads me to believe that group_by is probably not zero cost for this use case. Answer accepted anyway!

– David Copernicus Bowie
Apr 14 at 13:55

How can I append a formatted string to an existing String?

– Shepmaster
Apr 14 at 14:57

add a comment |

David Copernicus Bowie is a new contributor. Be nice, and check out our Code of Conduct.

draft saved

draft discarded

David Copernicus Bowie is a new contributor. Be nice, and check out our Code of Conduct.

Thanks for contributing an answer to Stack Overflow!

Please be sure to answer the question. Provide details and share your research!

But avoid …

Asking for help, clarification, or responding to other answers.

Making statements based on opinion; back them up with references or personal experience.

To learn more, see our tips on writing great answers.

draft saved

draft discarded

Post as a guest

Name

Required, but never shown

Name

Required, but never shown

Name

Required, but never shown

This page is only for reference, If you need detailed information, please check here

搜尋此網誌

Ygtjki

2 Answers
2

Supporting code

Memory Allocations

Your Answer

Post as a guest

2 Answers
2

2 Answers
2

Supporting code

Supporting code

Supporting code

Supporting code

Memory Allocations

Memory Allocations

Memory Allocations

Memory Allocations

Post as a guest

Popular posts from this blog

Àrd-bhaile Cathair chruinne/Baile mòr cruinne | Artagailean ceangailte | Clàr-taice na seòladaireachd

2 Answers 2

Supporting code

Memory Allocations

Your Answer

Sign up or log in

Post as a guest

Post as a guest

2 Answers 2

2 Answers 2

Supporting code

Supporting code

Supporting code

Supporting code

Memory Allocations

Memory Allocations

Memory Allocations

Memory Allocations

Sign up or log in

Post as a guest

Post as a guest

Sign up or log in

Post as a guest

Sign up or log in

Post as a guest

Sign up or log in

Post as a guest

Popular posts from this blog

Àrd-bhaile Cathair chruinne/Baile mòr cruinne | Artagailean ceangailte | Clàr-taice na seòladaireachd

2 Answers
2

2 Answers
2

2 Answers
2