The tree difference algorithm does not handle unnormalized Unicode names correctly (tree listing handles them correctly, however). Here is how to reproduce:
git clone git://github.com/dotnet/cli /tmp/cli
go run bug.go /tmp/cli
bug.go:
package main

import (
	"os"
	"strings"

	"gopkg.in/src-d/go-git.v4"
	"gopkg.in/src-d/go-git.v4/plumbing"
	"gopkg.in/src-d/go-git.v4/plumbing/object"
)

func main() {
	r, err := git.PlainOpen(os.Args[1])
	if err != nil {
		panic(err)
	}

	// First commit: collect every file name under TestAssets/TestProjects.
	c, err := r.CommitObject(plumbing.NewHash("55c59d621ea22921ecaabd99266d45a7921aab70"))
	if err != nil {
		panic(err)
	}
	t1, err := c.Tree()
	if err != nil {
		panic(err)
	}
	t1, err = t1.Tree("TestAssets/TestProjects")
	if err != nil {
		panic(err)
	}
	files := map[string]bool{}
	err = t1.Files().ForEach(func(f *object.File) error {
		files[f.Name] = true
		return nil
	})
	if err != nil {
		panic(err)
	}

	// Second commit: the same subtree one commit later.
	c, err = r.CommitObject(plumbing.NewHash("6fcbefa4f7a0016a68d3cda52779298a5cd20837"))
	if err != nil {
		panic(err)
	}
	t2, err := c.Tree()
	if err != nil {
		panic(err)
	}
	t2, err = t2.Tree("TestAssets/TestProjects")
	if err != nil {
		panic(err)
	}

	// Diff the two subtrees and print each matching change, followed by
	// whether the first tree's listing already contains that path.
	diff, err := object.DiffTree(t1, t2)
	if err != nil {
		panic(err)
	}
	for _, d := range diff {
		if strings.HasPrefix(d.To.Name, "TestAppWithUnico") &&
			strings.HasSuffix(d.To.Name, "Program.cs") {
			println(d.String())
			println(files[d.To.Name])
		}
	}
}
We see:
<Action: Insert, Path: TestAppWithUnicodéPath/Program.cs>
true
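The expected output is empty: the tree listing of the first commit already contains this path (hence the "true" above), so the diff should not report it as an insertion.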
Here is what is happening. 55c59d621ea22921ecaabd99266d45a7921aab70 and 6fcbefa4f7a0016a68d3cda52779298a5cd20837 are two consecutive commits.
cd /tmp/cli
git checkout 55c59d621ea22921ecaabd99266d45a7921aab70
echo TestAssets/TestProjects/TestAppWithUni*
git checkout 6fcbefa4f7a0016a68d3cda52779298a5cd20837
echo TestAssets/TestProjects/TestAppWithUni*
Output:
TestAssets/TestProjects/TestAppWithUnicodéPath
TestAssets/TestProjects/TestAppWithUnicodéPath TestAssets/TestProjects/TestAppWithUnicodéPath
After the second commit there are two almost identical directories. One name is in normalized Unicode, the other is not, so they look the same when printed but are different names.
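To illustrate how two names can look the same and still be different strings, here is a minimal standalone Go sketch. It assumes the two directory names differ as a precomposed é (U+00E9) versus a plain e followed by a combining acute accent (U+0301). The report does not state which form each directory actually uses, so the two constants and the describe helper are purely illustrative.

```go
package main

import "fmt"

// Two spellings of the directory name. Which form each directory in the
// repository really uses is an assumption made for illustration only.
const (
	composed   = "TestAppWithUnicod\u00e9Path"  // é as a single code point
	decomposed = "TestAppWithUnicode\u0301Path" // e + combining acute accent
)

// describe returns the UTF-8 byte length and the code point count of name.
func describe(name string) (nBytes, nRunes int) {
	return len(name), len([]rune(name))
}

func main() {
	for _, name := range []string{composed, decomposed} {
		b, r := describe(name)
		// %+q escapes non-ASCII characters, making the difference visible.
		fmt.Printf("%+q: %d bytes, %d code points\n", name, b, r)
	}
	// Go compares strings byte by byte, so the two names are not equal.
	fmt.Println("equal:", composed == decomposed)
}
```

Any code that compares or sorts such names by their raw bytes treats them as two distinct entries, even though a terminal renders them identically.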