Closed
Description
I want to parse only the outer tr and td elements of a table and ignore any nested tables. My approach is to select the outer tr elements first and then search that selection for td elements. I expect to find 2 tr elements and 3 td elements.
In the example below, I don't understand why the last approach, passing .start > tbody > tr > td to Find on the row selection, returns exactly the 3 outer td elements. Doesn't Find only search descendants? The element with the start class and the tbody element are ancestors of the rows in the selection, not descendants, right?
package main

import (
	"fmt"
	"log"
	"strings"

	"github.com/PuerkitoBio/goquery"
)

var data = `
<!DOCTYPE html>
<html>
<body>
<table class="start">
<tbody>
<tr>
<td>test1</td>
<td>test2</td>
</tr>
<tr>
<td>
<table>
<tbody>
<tr>
<td>test3</td>
<td>test4</td>
</tr>
<tr>
<td>test5</td>
<td>test6</td>
</tr>
</tbody>
</table>
</td>
</tr>
</tbody>
</table>
</body>
</html>
`

func main() {
	doc, err := goquery.NewDocumentFromReader(strings.NewReader(data))
	if err != nil {
		log.Fatal(err)
	}

	// find outer tr
	rowSelection := doc.Find(".start > tbody > tr")
	fmt.Println(len(rowSelection.Nodes))

	// finds all td
	colSelection := rowSelection.Find("td")
	fmt.Println(len(colSelection.Nodes))

	// finds all td
	colSelection = rowSelection.Find("tr > td")
	fmt.Println(len(colSelection.Nodes))

	// finds no td
	colSelection = rowSelection.Find("> td")
	fmt.Println(len(colSelection.Nodes))

	// finds outer td
	colSelection = rowSelection.Find(".start > tbody > tr > td")
	fmt.Println(len(colSelection.Nodes))
}

Output:
2
7
7
0
3
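
One way to make sense of these counts is a toy model of the matching, sketched below. It is not goquery's code: Node, el, matches, find and sampleTable are hypothetical names made up for illustration, and the model assumes that Find only restricts which nodes are candidates (the descendants of the selection) while the selector itself is checked against each candidate's real ancestors, including ancestors outside the selection. Under that assumption the model gives the same counts as the output above, apart from the "> td" case, which it does not cover.

```go
package main

import "fmt"

// Node is a toy stand-in for an HTML element (hypothetical, not a goquery type).
type Node struct {
	Tag, Class string
	Parent     *Node
	Children   []*Node
}

// el creates an element and attaches the given children to it.
func el(tag, class string, children ...*Node) *Node {
	n := &Node{Tag: tag, Class: class, Children: children}
	for _, c := range children {
		c.Parent = n
	}
	return n
}

// matches reports whether n satisfies a chain of child combinators such as
// [".start", "tbody", "tr", "td"]. It walks up through n's real ancestors,
// even when those ancestors lie outside the selection being searched.
func matches(n *Node, chain []string) bool {
	for i := len(chain) - 1; i >= 0; i-- {
		if n == nil {
			return false
		}
		step := chain[i]
		if step[0] == '.' {
			if n.Class != step[1:] {
				return false
			}
		} else if n.Tag != step {
			return false
		}
		n = n.Parent
	}
	return true
}

// find returns every descendant of the given roots that matches the chain.
// Only the candidate set is limited to descendants; the match is not scoped.
func find(roots []*Node, chain ...string) []*Node {
	var out []*Node
	var walk func(n *Node)
	walk = func(n *Node) {
		for _, c := range n.Children {
			if matches(c, chain) {
				out = append(out, c)
			}
			walk(c)
		}
	}
	for _, r := range roots {
		walk(r)
	}
	return out
}

// sampleTable rebuilds the table structure from the example above.
func sampleTable() *Node {
	inner := el("table", "",
		el("tbody", "",
			el("tr", "", el("td", ""), el("td", "")),
			el("tr", "", el("td", ""), el("td", ""))))
	return el("table", "start",
		el("tbody", "",
			el("tr", "", el("td", ""), el("td", "")),
			el("tr", "", el("td", "", inner))))
}

func main() {
	body := el("body", "", sampleTable())
	rows := find([]*Node{body}, ".start", "tbody", "tr")
	fmt.Println(len(rows))                                      // 2
	fmt.Println(len(find(rows, "td")))                          // 7
	fmt.Println(len(find(rows, "tr", "td")))                    // 7
	fmt.Println(len(find(rows, ".start", "tbody", "tr", "td"))) // 3
}
```

If that model is right, the full path works because the three outer td elements are descendants of the selected rows and also match the selector when their ancestors are taken into account. In goquery itself, rowSelection.ChildrenFiltered("td") (or rowSelection.Children()) should return only the direct td children of each selected row, which would give the 3 outer cells without repeating the path.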