Skip to content

Add databricks fs ls command for WSFS#408

Closed
shreyas-goenka wants to merge 8 commits intomainfrom
lscmd
Closed

Add databricks fs ls command for WSFS#408
shreyas-goenka wants to merge 8 commits intomainfrom
lscmd

Conversation

@shreyas-goenka
Copy link
Copy Markdown
Contributor

@shreyas-goenka shreyas-goenka commented May 26, 2023

Tests

Manually

@shreyas-goenka
Copy link
Copy Markdown
Contributor Author

Output:

shreyas.goenka@THW32HFW6T bricks % cli fs ls /Users/[email protected] --absolute
/Users/[email protected]/sync-fail
/Users/[email protected]/bailando
/Users/[email protected]/.sdk
/Users/[email protected]/x.py
/Users/[email protected]/.ide
/Users/[email protected]/legacy-cli
/Users/[email protected]/new-cli
/Users/[email protected]/foo
/Users/[email protected]/[deco-438] valid notebook
/Users/[email protected]/[deco-438] invalid notebook
/Users/[email protected]/.bundle
/Users/[email protected]/job-output
shreyas.goenka@THW32HFW6T bricks % cli fs ls /Users/[email protected] --absolute -l
DIRECTORY  0  Thu Jan  1 00:00:00 UTC 1970  /Users/[email protected]/sync-fail
NOTEBOOK   0  Wed Apr 26 14:47:24 UTC 2023  /Users/[email protected]/bailando
DIRECTORY  0  Thu Jan  1 00:00:00 UTC 1970  /Users/[email protected]/.sdk
FILE       0  Wed Mar  8 22:37:17 UTC 2023  /Users/[email protected]/x.py
DIRECTORY  0  Thu Jan  1 00:00:00 UTC 1970  /Users/[email protected]/.ide
DIRECTORY  0  Thu Jan  1 00:00:00 UTC 1970  /Users/[email protected]/legacy-cli
DIRECTORY  0  Thu Jan  1 00:00:00 UTC 1970  /Users/[email protected]/new-cli
NOTEBOOK   0  Wed Apr 26 14:44:38 UTC 2023  /Users/[email protected]/foo
NOTEBOOK   0  Tue Mar 21 16:40:29 UTC 2023  /Users/[email protected]/[deco-438] valid notebook
NOTEBOOK   0  Wed Mar  8 18:10:53 UTC 2023  /Users/[email protected]/[deco-438] invalid notebook
DIRECTORY  0  Thu Jan  1 00:00:00 UTC 1970  /Users/[email protected]/.bundle
DIRECTORY  0  Thu Jan  1 00:00:00 UTC 1970  /Users/[email protected]/job-output
shreyas.goenka@THW32HFW6T bricks % cli fs ls /Users/[email protected] -l
DIRECTORY  0  Thu Jan  1 00:00:00 UTC 1970  sync-fail
NOTEBOOK   0  Wed Apr 26 14:47:24 UTC 2023  bailando
DIRECTORY  0  Thu Jan  1 00:00:00 UTC 1970  .sdk
FILE       0  Wed Mar  8 22:37:17 UTC 2023  x.py
DIRECTORY  0  Thu Jan  1 00:00:00 UTC 1970  .ide
DIRECTORY  0  Thu Jan  1 00:00:00 UTC 1970  legacy-cli
DIRECTORY  0  Thu Jan  1 00:00:00 UTC 1970  new-cli
NOTEBOOK   0  Wed Apr 26 14:44:38 UTC 2023  foo
NOTEBOOK   0  Tue Mar 21 16:40:29 UTC 2023  [deco-438] valid notebook
NOTEBOOK   0  Wed Mar  8 18:10:53 UTC 2023  [deco-438] invalid notebook
DIRECTORY  0  Thu Jan  1 00:00:00 UTC 1970  .bundle
DIRECTORY  0  Thu Jan  1 00:00:00 UTC 1970  job-output
shreyas.goenka@THW32HFW6T bricks % cli fs ls /Users/[email protected]
sync-fail
bailando
.sdk
x.py
.ide
legacy-cli
new-cli
foo
[deco-438] valid notebook
[deco-438] invalid notebook
.bundle
job-output
s

@shreyas-goenka
Copy link
Copy Markdown
Contributor Author

TODO: add integration test

Copy link
Copy Markdown
Contributor

@pietern pietern left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please also include the integration test for the filer code (there are existing filer tests).

return fmt.Errorf("invalid output format: %s", c.outputFormat)
}
}

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can set the "template" annotation on the command to achieve the same result. Then this function doesn't need to be added.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

tried that first, does not work because the cmdIO object is initialized in RootCmd, so overriding annotation does not work

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@shreyas-goenka please add this as a comment in the command and to this function.

Also, please make Render to call RenderWithTemplate with c.template, otherwise we have two code paths for the same thing

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done in followup

}

// Get file info
filesInfo, err := f.ReadDir(ctx, "")
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should pass path.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well no right? since the filer is rooted at the arg[0] already

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ack, you're right. Could you modify this to . to make it explicit, and include a comment saying so?

output := make([]map[string]string, 0)
for _, info := range filesInfo {
output = append(output, parseFileInfo(info, path, absolute))
}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You could alias the FileInfo struct and add a couple helper functions to avoid creating a map[string]string. Those helpers would then also be more easily testable.

Copy link
Copy Markdown
Contributor

@pietern pietern left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please augment the filer integration test to cover the additional functions.

I notice the path.Base call there, implying the returned paths are relative to the specified path, not the filer root. This needs to be covered by an integration test as it is ambiguous otherwise.

There is overlap between the filer integration test and the new workspace testdata helpers. Can you consolidate this in a follow up PR? I'm not keen on the proliferation of duplicate test code.


// Make API request
err = apiClient.Do(ctx, http.MethodPost, urlPath, content, nil)
require.NoError(w.t, err)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is equivalent to what filer.Filer does. Please reuse that instead of a separate impl.


func assertObjectListed(t *testing.T, parsedLogs []map[string]string, name string, objectType string) {
foundFile := false
for _, v := range parsedLogs {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How are these parsed logs? Looking at the code I see []filer.FileInfo.

err := json.Unmarshal(stdout.Bytes(), &parsedLogs)
require.NoError(t, err)

// make assertions on the output logs
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please refer to just "output". The "logs" suffix confuses it with our logging infrastructure.

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1, we need a couple of type-safe CLI run test helpers

pietern added a commit that referenced this pull request May 31, 2023
This cherry-picks the filer changes from #408.
pietern added a commit that referenced this pull request May 31, 2023
## Changes

This cherry-picks the filer changes from #408.

## Tests

Manually ran integration tests.
`)
if longMode {
template = cmdio.Heredoc(`
{{range .}}{{.Type|printf "%-10s"}} {{.Size}} {{.ModTime|unix_date}} {{.Name}}
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what if we support just the long mode? then the code is more maintainable and template ends up in the annotation

return fmt.Errorf("invalid output format: %s", c.outputFormat)
}
}

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@shreyas-goenka please add this as a comment in the command and to this function.

Also, please make Render to call RenderWithTemplate with c.template, otherwise we have two code paths for the same thing

// Initialize workspace client
ctx := cmd.Context()
w := root.WorkspaceClient(ctx)
f, err := filer.NewWorkspaceFilesClient(w, rootPath)
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

databricks fs lists DBFS, not the workspace. why do we use workspace files?

err := json.Unmarshal(stdout.Bytes(), &parsedLogs)
require.NoError(t, err)

// make assertions on the output logs
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1, we need a couple of type-safe CLI run test helpers

return w.root
}

func (w *workspaceTestdata) AddFile(name string, content string) {
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you also add a test for DBFS?

return string(b), nil
},
"unix_date": func(t time.Time) string {
return t.UTC().Format(time.UnixDate)
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i've looked at the output - it's very difficult to parse when piped through aws. use the 2006-01-02T15:04:05Z as a format.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done in the followup PR

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done in the followup PR

@pietern
Copy link
Copy Markdown
Contributor

pietern commented Jun 5, 2023

This PR is continued in #429.

@pietern pietern closed this Jun 5, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants