Welcome to Vigges Developer Community - Open, Learning, Share

How can I clone a strings.Builder in Go?

The Go programming language's standard library exposes a struct called strings.Builder which allows for easy building of strings through repeated concatenation in an efficient way, similar to C# or Java's StringBuilder.

In Java I would use StringBuilder's constructor to "clone" the object, like this:

StringBuilder newBuffer = new StringBuilder(oldBuffer.toString());

In Go, I can only see the following two-line way:

newBuffer := strings.Builder{}
newBuffer.WriteString(oldBuffer.String())

and no .Clone() or initializer method (which I might have just not found yet).

Is there another way that would be more brief/concise than the one I have presented?



1 Answer


Going into unnecessary detail for curiosity's sake...

After considering the documentation, here are your main issues:

  1. The only exported way to read data out of the Builder is the Builder.String method.
  2. It is not safe to copy a Builder value once you have manipulated it.
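
The second point can be seen directly. The following is a minimal sketch (the function name writeToCopy is made up for illustration) that copies a Builder by value after it has been written to, then writes to the copy. It relies on the copy check that strings.Builder performs in its write methods, which panics when it detects a copied non-zero Builder:

```go
package main

import (
	"fmt"
	"strings"
)

// writeToCopy (hypothetical helper) copies a used Builder by value and
// then writes to the copy. It reports whether that write panicked.
func writeToCopy() (panicked bool) {
	defer func() {
		if r := recover(); r != nil {
			panicked = true
		}
	}()

	var original strings.Builder
	original.WriteString("hello")

	copied := original           // plain value copy of a non-zero Builder
	copied.WriteString(" world") // the Builder's copy check is expected to panic here
	return false
}

func main() {
	fmt.Println("write to copied Builder panicked:", writeToCopy())
}
```

This is why a "clone" has to go through String() and a fresh Builder rather than a simple assignment.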

Let's look at this version:

newBuffer := strings.Builder{}
newBuffer.WriteString(oldBuffer.String())

My first thought about why this might not be desirable was that the Builder internally uses a byte slice (a mutable type) but returns a string (an immutable type). Even though a string and a byte slice are laid out almost identically in memory, converting a []byte to a string normally requires a copy so that later changes to the slice cannot alter the string. That would mean that by the time you write the string to the new buffer, you're already on your second copy when your task intuitively only requires a single copy.

Actually taking a look at the source code, however, we'll see that this assumption is wrong:

func (b *Builder) String() string {
    return *(*string)(unsafe.Pointer(&b.buf))
}

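Using the unsafe package, the strings package basically "hacks" the buffer ([]byte) directly into a string. These data types are nearly the same at the memory level: a string header is a pointer to the first byte plus a length, and a slice header is those same two fields followed by a capacity. Reinterpreting the slice header as a string header simply reads the pointer and the length. Both types are just headers, so no copying of the buffer has occurred here. (Newer Go releases express the same conversion with unsafe.String and unsafe.SliceData, with the same no-copy effect.)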
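
The no-copy behaviour described above can be checked with a small sketch. It assumes Go 1.20 or later (for unsafe.StringData) and depends on an implementation detail of strings.Builder rather than a documented guarantee. The function name sharesBuffer is made up for illustration:

```go
package main

import (
	"fmt"
	"strings"
	"unsafe"
)

// sharesBuffer (hypothetical helper) reports whether two String() calls
// on the same Builder return strings backed by the same bytes.
func sharesBuffer() bool {
	var b strings.Builder
	b.WriteString("hello")

	s1 := b.String()
	s2 := b.String()

	// unsafe.StringData returns a pointer to the first byte of a string.
	// If String() copied the buffer, the two pointers would differ.
	return unsafe.StringData(s1) == unsafe.StringData(s2)
}

func main() {
	fmt.Println("String() results share one buffer:", sharesBuffer())
}
```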

This creates the uncomfortable situation where you have a string which is supposed to be immutable, but you still have a byte slice somewhere that could mutate those underlying bytes. The package is called unsafe after all, and this is a good example of why that is.

Because the strings.Builder is purely a "builder", i.e. it can only create new parts of the string and never modify data that's already written, we still get the immutability of our string that the language "guarantees". The only way we can break that rule is by gaining access to the internal buf of the Builder, but as that field is un-exported, you would again need to employ unsafe yourself to access it.
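
A short sketch of that append-only property: a string taken from the Builder keeps its value even after more data is written. The function name snapshotThenAppend is made up for illustration:

```go
package main

import (
	"fmt"
	"strings"
)

// snapshotThenAppend (hypothetical helper) takes a string from a Builder,
// appends more data, and returns both the earlier string and the new one.
func snapshotThenAppend() (snapshot, full string) {
	var b strings.Builder
	b.WriteString("hello")

	snapshot = b.String() // string obtained before further writes

	b.WriteString(", world") // appends only; bytes already written are not touched
	full = b.String()
	return snapshot, full
}

func main() {
	snapshot, full := snapshotThenAppend()
	fmt.Printf("snapshot=%q full=%q\n", snapshot, full)
}
```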

Summary:

The straightforward method you came up with, while perhaps a line (or two) longer than one might hope for, is the definitive and correct way to do it. It's already as efficient as you're going to get, even if you bring out the more gritty features of Go like unsafe and reflect.

I hope that this has been informative. Here are the only suggested changes to your code:

// clone the builder contents. this is fast.
newBuffer := strings.Builder{}
newBuffer.WriteString(oldBuffer.String())
