Go Blog: Go slice internals (study notes)

# Slice internals

A slice is a descriptor of an array segment. It consists of a pointer to the array, the length of the segment, and its capacity (the maximum length of the segment).

Our variable s, created earlier by make([]byte, 5), holds a pointer to a 5-element array, a length of 5, and a capacity of 5.

The length is the number of elements referred to by the slice. The capacity is the number of elements in the underlying array (beginning at the element referred to by the slice pointer). The distinction between length and capacity will be made clear as we walk through the next few examples.
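As a quick check of the length and capacity definitions, here is a minimal sketch using the built-in len and cap functions. The helper name `describe` is made up for this note and is not part of the original article.

```go
package main

import "fmt"

// describe (hypothetical helper) returns the two integer fields
// of the slice descriptor: its length and its capacity.
func describe(s []byte) (length, capacity int) {
	return len(s), cap(s)
}

func main() {
	s := make([]byte, 5) // length and capacity are both 5
	l, c := describe(s)
	fmt.Println(l, c) // 5 5

	t := make([]byte, 3, 10) // length 3, capacity 10
	l, c = describe(t)
	fmt.Println(l, c) // 3 10
}
```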

As we slice s, observe the changes in the slice data structure and their relation to the underlying array:

```go
s = s[2:4]
```

Slicing does not copy the slice's data. It creates a new slice value that points to the original array. This makes slice operations as efficient as manipulating array indices. Therefore, modifying the elements (not the slice itself) of a re-slice modifies the elements of the original slice:

```go
d := []byte{'r', 'o', 'a', 'd'}
e := d[2:]
// e == []byte{'a', 'd'}
e[1] = 'm'
// e == []byte{'a', 'm'}
// d == []byte{'r', 'o', 'a', 'm'}
```

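Note: e is a new slice obtained by applying the slice operation `[start:end]` to slice d, yet modifying an element of e also modifies the corresponding element of the original slice d.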

Earlier we sliced s to a length shorter than its capacity. We can grow s to its capacity by slicing it again:

```go
// Earlier code was: s = s[2:4]
// Now:
s = s[:cap(s)]
```

A slice cannot be grown beyond its capacity. Attempting to do so will cause a runtime panic, just as when indexing outside the bounds of a slice or array. Similarly, slices cannot be re-sliced below zero to access earlier elements in the array.
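The re-slicing steps and the capacity limit can be observed with a small sketch. The helper `growPastCap` is a name invented for this note; it recovers from the panic only so that the program can report it.

```go
package main

import "fmt"

// growPastCap (hypothetical helper) tries to re-slice s one element
// beyond its capacity and reports whether the attempt panicked.
func growPastCap(s []byte) (panicked bool) {
	defer func() {
		if r := recover(); r != nil {
			panicked = true
		}
	}()
	n := cap(s) + 1
	_ = s[:n] // out of range: the upper bound exceeds cap(s)
	return false
}

func main() {
	s := make([]byte, 5)
	s = s[2:4]
	fmt.Println(len(s), cap(s)) // 2 3

	s = s[:cap(s)]
	fmt.Println(len(s), cap(s)) // 3 3

	fmt.Println(growPastCap(s)) // true
}
```

Note that the capacity dropped from 5 to 3 after `s[2:4]`: capacity is counted from the element the slice pointer refers to, so the two skipped elements are no longer reachable through s.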

# Growing slices (the copy and append functions)

To increase the capacity of a slice one must create a new, larger slice and copy the contents of the original slice into it. This technique is how dynamic array implementations from other languages work behind the scenes. The next example doubles the capacity of s by making a new slice, t, copying the contents of s into t, and then assigning the slice value t to s:

```go
t := make([]byte, len(s), (cap(s)+1)*2) // +1 in case cap(s) == 0
for i := range s {
	t[i] = s[i]
}
s = t
```

The looping piece of this common operation is made easier by the built-in copy function. As the name suggests, copy copies data from a source slice to a destination slice. It returns the number of elements copied.

```go
func copy(dst, src []T) int
```

The copy function supports copying between slices of different lengths (it will copy only up to the smaller number of elements). In addition, copy can handle source and destination slices that share the same underlying array, handling overlapping slices correctly.

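Note: `copy` copies min(len(dst), len(src)) elements, so the two slices may have different lengths. It also works when the source and destination share the same underlying array, and it handles the overlapping part correctly.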
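Both properties of copy can be checked with a short sketch. The helper names `copyPrefix` and `shiftRight2` are invented for this note.

```go
package main

import "fmt"

// copyPrefix (hypothetical helper) copies src into a destination of
// length n and returns the destination and the number of elements copied.
func copyPrefix(src []byte, n int) ([]byte, int) {
	dst := make([]byte, n)
	copied := copy(dst, src) // copies min(len(dst), len(src)) elements
	return dst, copied
}

// shiftRight2 (hypothetical helper) copies s onto itself two positions
// to the right. Source and destination overlap. Assumes len(s) >= 2.
func shiftRight2(s []byte) int {
	return copy(s[2:], s)
}

func main() {
	dst, n := copyPrefix([]byte("hello"), 3)
	fmt.Println(string(dst), n) // hel 3

	s := []byte("abcdef")
	n = shiftRight2(s)
	fmt.Println(string(s), n) // ababcd 4
}
```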

Using copy, we can simplify the code snippet above:

```go
t := make([]byte, len(s), (cap(s)+1)*2)
copy(t, s)
s = t
```

A common operation is to append data to the end of a slice. This function appends byte elements to a slice of bytes, growing the slice if necessary, and returns the updated slice value:

```go
func AppendByte(slice []byte, data ...byte) []byte {
	m := len(slice)
	n := m + len(data)
	if n > cap(slice) { // if necessary, reallocate
		// allocate double what's needed, for future growth.
		newSlice := make([]byte, (n+1)*2)
		copy(newSlice, slice)
		slice = newSlice
	}
	slice = slice[0:n]
	copy(slice[m:n], data)
	return slice
}
```

One could use AppendByte like this:

```go
p := []byte{2, 3, 5}
p = AppendByte(p, 7, 11, 13)
// p == []byte{2, 3, 5, 7, 11, 13}
```

Functions like AppendByte are useful because they offer complete control over the way the slice is grown. Depending on the characteristics of the program, it may be desirable to allocate in smaller or larger chunks, or to put a ceiling on the size of a reallocation.
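As an illustration of "a ceiling on the size of a reallocation", here is one possible variant of AppendByte. It is only a sketch: the name `AppendByteCapped` and the constant `maxExtra` are assumptions made up for this note, and the policy (double, but never add more than `maxExtra` spare elements) is just one of many reasonable choices.

```go
package main

import "fmt"

// maxExtra is a hypothetical ceiling on the spare room added per reallocation.
const maxExtra = 1024

// AppendByteCapped (hypothetical variant of AppendByte) appends data to
// slice. When it must reallocate, it doubles the needed size, but adds
// at most maxExtra elements of spare capacity.
func AppendByteCapped(slice []byte, data ...byte) []byte {
	m := len(slice)
	n := m + len(data)
	if n > cap(slice) {
		extra := n // doubling means n spare elements
		if extra > maxExtra {
			extra = maxExtra
		}
		newSlice := make([]byte, n, n+extra)
		copy(newSlice, slice)
		slice = newSlice
	}
	slice = slice[:n]
	copy(slice[m:], data)
	return slice
}

func main() {
	p := AppendByteCapped([]byte{2, 3, 5}, 7, 11, 13)
	fmt.Println(p, cap(p)) // [2 3 5 7 11 13] 12

	big := AppendByteCapped(nil, make([]byte, 5000)...)
	fmt.Println(len(big), cap(big)) // 5000 6024
}
```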

But most programs don’t need complete control, so Go provides a built-in append function that’s good for most purposes; it has the signature

```go
func append(s []T, x ...T) []T
```

The append function appends the elements x to the end of the slice s, and grows the slice if a greater capacity is needed.

```go
a := make([]int, 1)
// a == []int{0}
a = append(a, 1, 2, 3)
// a == []int{0, 1, 2, 3}
```
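One consequence of "grows the slice if a greater capacity is needed" is that the result of append shares the original array only when the new elements fit in the existing capacity. A minimal sketch (the helper name `aliasAfterAppend` is made up for this note):

```go
package main

import "fmt"

// aliasAfterAppend (hypothetical helper) appends one element to s, writes 9
// into element 0 of the result, and reports whether that write is visible
// through s. Assumes len(s) >= 1 and s[0] != 9.
func aliasAfterAppend(s []int) bool {
	t := append(s, 1)
	t[0] = 9
	return s[0] == 9
}

func main() {
	roomy := make([]int, 2, 4) // spare capacity: append reuses the array
	fmt.Println(aliasAfterAppend(roomy)) // true

	full := make([]int, 2, 2) // no spare capacity: append allocates a new array
	fmt.Println(aliasAfterAppend(full)) // false
}
```

This is why the result of append should always be assigned back, typically to the same variable.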

To append one slice to another, use `...` to expand the second argument to a list of arguments.

```go
a := []string{"John", "Paul"}
b := []string{"George", "Ringo", "Pete"}
a = append(a, b...) // equivalent to "append(a, b[0], b[1], b[2])"
// a == []string{"John", "Paul", "George", "Ringo", "Pete"}
```

Since the zero value of a slice (nil) acts like a zero-length slice, you can declare a slice variable and then append to it in a loop:

```go
// Filter returns a new slice holding only
// the elements of s that satisfy fn()
func Filter(s []int, fn func(int) bool) []int {
	var p []int // == nil
	for _, v := range s {
		if fn(v) {
			p = append(p, v)
		}
	}
	return p
}
```
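A short usage sketch of Filter follows. The predicate `isEven` is a made-up example. Note that when no element matches, the function returns the nil slice it started with, which still has length 0.

```go
package main

import "fmt"

// Filter is repeated from above so that this example is self-contained.
func Filter(s []int, fn func(int) bool) []int {
	var p []int // == nil
	for _, v := range s {
		if fn(v) {
			p = append(p, v)
		}
	}
	return p
}

// isEven is a hypothetical predicate used only for this illustration.
func isEven(n int) bool { return n%2 == 0 }

func main() {
	evens := Filter([]int{1, 2, 3, 4, 5, 6}, isEven)
	fmt.Println(evens) // [2 4 6]

	none := Filter([]int{1, 3, 5}, isEven)
	fmt.Println(none == nil, len(none)) // true 0
}
```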

# A possible “gotcha”

As mentioned earlier, re-slicing a slice doesn’t make a copy of the underlying array. The full array will be kept in memory until it is no longer referenced. Occasionally this can cause the program to hold all the data in memory when only a small piece of it is needed.

For example, this FindDigits function loads a file into memory and searches it for the first group of consecutive numeric digits, returning them as a new slice.

```go
var digitRegexp = regexp.MustCompile("[0-9]+")

func FindDigits(filename string) []byte {
	b, _ := ioutil.ReadFile(filename)
	return digitRegexp.Find(b)
}
```

This code behaves as advertised, but the returned []byte points into an array containing the entire file. Since the slice references the original array, as long as the slice is kept around the garbage collector can’t release the array; the few useful bytes of the file keep the entire contents in memory.

To fix this problem one can copy the interesting data to a new slice before returning it:

```go
func CopyDigits(filename string) []byte {
	b, _ := ioutil.ReadFile(filename)
	b = digitRegexp.Find(b)
	c := make([]byte, len(b))
	copy(c, b)
	return c
}
```

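A more concise version of this function can be written with append. Appending to a nil slice cannot reuse the file's array, so the result does not keep the file contents alive:

```go
func CopyDigits(filename string) []byte {
	b, _ := ioutil.ReadFile(filename)
	b = digitRegexp.Find(b)
	// Start from a nil slice: make([]byte, len(b)) followed by append
	// would return len(b) zero bytes in front of the digits.
	return append([]byte(nil), b...)
}
```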
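The difference between returning a sub-slice and returning a copy can be observed without reading a file. The sketch below works on an in-memory buffer; the helper names `findDigits` and `copyDigits` are variants invented for this note that take the data directly instead of a filename.

```go
package main

import (
	"fmt"
	"regexp"
)

var digitRegexp = regexp.MustCompile("[0-9]+")

// findDigits (hypothetical variant) returns a slice that points into b.
func findDigits(b []byte) []byte {
	return digitRegexp.Find(b)
}

// copyDigits (hypothetical variant) returns the match in a freshly
// allocated slice that does not reference b's array.
func copyDigits(b []byte) []byte {
	m := digitRegexp.Find(b)
	c := make([]byte, len(m))
	copy(c, m)
	return c
}

func main() {
	data := []byte("abc 123 def")
	view := findDigits(data)
	dup := copyDigits(data)

	data[4] = '9' // overwrite the first digit in the original buffer

	fmt.Printf("%s %s\n", view, dup) // 923 123
}
```

The write through data shows up in view, which proves that view still references the original array and therefore keeps it reachable. The copy is unaffected.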

Effective Go contains an in-depth treatment of slices and arrays, and the Go language specification defines slices and their associated helper functions.

## Effective Go | Slices

Slices wrap arrays to give a more general, powerful, and convenient interface to sequences of data. Except for items with explicit dimension such as transformation matrices, most array programming in Go is done with slices rather than simple arrays.

Slices hold references to an underlying array, and if you assign one slice to another, both refer to the same array. If a function takes a slice argument, changes it makes to the elements of the slice will be visible to the caller, analogous to passing a pointer to the underlying array. A `Read` function can therefore accept a slice argument rather than a pointer and a count; the length within the slice sets an upper limit of how much data to read. Here is the signature of the `Read` method of the `File` type in package `os`:

```go
func (f *File) Read(buf []byte) (n int, err error)
```

The method returns the number of bytes read and an error value, if any. To read into the first 32 bytes of a larger buffer buf, slice (here used as a verb) the buffer.

```go
n, err := f.Read(buf[0:32])
```

Such slicing is common and efficient. In fact, leaving efficiency aside for the moment, the following snippet would also read the first 32 bytes of the buffer.

```go
var n int
var err error
for i := 0; i < 32; i++ {
	nbytes, e := f.Read(buf[i:i+1]) // Read one byte.
	n += nbytes                     // Count the byte even if an error came with it.
	if nbytes == 0 || e != nil {
		err = e
		break
	}
}
```

The length of a slice may be changed as long as it still fits within the limits of the underlying array; just assign it to a slice of itself. The capacity of a slice, accessible by the built-in function cap, reports the maximum length the slice may assume. Here is a function to append data to a slice. If the data exceeds the capacity, the slice is reallocated. The resulting slice is returned. The function uses the fact that len and cap are legal when applied to the nil slice, and return 0.

```go
func Append(slice, data []byte) []byte {
	l := len(slice)
	if l+len(data) > cap(slice) { // reallocate
		// Allocate double what's needed, for future growth.
		newSlice := make([]byte, (l+len(data))*2)
		copy(newSlice, slice)
		slice = newSlice
	}
	slice = slice[0 : l+len(data)]
	for i, c := range data {
		slice[l+i] = c
	}
	return slice
}
```

We must return the slice afterwards because, although Append can modify the elements of slice, the slice itself (the run-time data structure holding the pointer, length, and capacity) is passed by value.
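The pass-by-value behavior can be seen in a small sketch. The helper names `grow` and `setFirst` are invented for this note.

```go
package main

import "fmt"

// grow (hypothetical helper) appends to its own copy of the slice header.
// The caller's length is not changed.
func grow(s []byte) {
	s = append(s, 'x') // updates only the local header
	_ = s
}

// setFirst (hypothetical helper) modifies an element, which the caller
// sees because both headers point at the same array. Assumes len(s) >= 1.
func setFirst(s []byte) {
	s[0] = 'Z'
}

func main() {
	p := make([]byte, 3, 8)
	copy(p, "abc")

	grow(p)
	fmt.Println(len(p), string(p)) // 3 abc
	fmt.Println(string(p[:4]))     // abcx

	setFirst(p)
	fmt.Println(string(p)) // Zbc
}
```

Because p had spare capacity, grow did write 'x' into the shared array, but p's own length stayed at 3, so the new element only becomes visible after re-slicing. Returning the updated slice, as Append does, avoids this.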

The idea of appending to a slice is so useful it’s captured by the append built-in function. To understand that function’s design, though, we need a little more information, so we’ll return to it later.