Skip to content

feat: Adopt HTTP apis for web deployment#393

Merged
cguedes merged 31 commits into
mainfrom
381-reimplement-flow-for-pdf-uploads-to-work-with-the-new-http-backend-api
Aug 25, 2023
Merged

feat: Adopt HTTP apis for web deployment#393
cguedes merged 31 commits into
mainfrom
381-reimplement-flow-for-pdf-uploads-to-work-with-the-new-http-backend-api

Conversation

@cguedes

@cguedes cguedes commented Aug 22, 2023

Copy link
Copy Markdown
Collaborator

Part of work for #347 and closes #381

This PR connect the client code (web hosting) to backend APIs via (external) HTTP using standard fetch.
A follow-up PR will existing after #417 is merged, that will implement the unified codebase for the desktop+web client code.

It's also important to note that for this PR the goal was to not change the existing desktop behaviour, implementing if/else checks with import.meta.env.VITE_IS_WEB.

For this PR the main changes where:

  • remove in-memory implementation of fs.tsand stub code. Now the code throws exception.
  • adopt the new project API and therefore, save the projectId in the client to use in every HTTP call.
  • implement web.ts and *API.ts utilities to perform calls to the backend HTTP in a web context.
  • Adjust the code to flow the projectId to every business logic call in order to support listing, opening, upload files, and also references operations (ingestion, list, delete, edit).
  • Adopt the new HTTP api for the chat operation. The rewrite is still using the sidecar command.

Notes:

  • The code/files structure wasn't a priority for this PR. That organisation will be implemented in the follow-up PR.
  • (web) the open project will open one of the previously created projects (in the web context)

Screenshot 2023-08-25 at 14 02 35

@cguedes cguedes linked an issue Aug 22, 2023 that may be closed by this pull request
Comment thread python/sidecar/http.py Outdated
@codecov

codecov Bot commented Aug 22, 2023

Copy link
Copy Markdown

Codecov Report

Merging #393 (890d347) into main (dfd2a6d) will increase coverage by 1.16%.
The diff coverage is 50.00%.

@@            Coverage Diff             @@
##             main     #393      +/-   ##
==========================================
+ Coverage   82.65%   83.81%   +1.16%     
==========================================
  Files         186       17     -169     
  Lines       10884     1168    -9716     
  Branches     1142        0    -1142     
==========================================
- Hits         8996      979    -8017     
+ Misses       1873      189    -1684     
+ Partials       15        0      -15     
Files Changed Coverage Δ
python/web.py 0.00% <0.00%> (ø)
python/sidecar/http.py 81.92% <100.00%> (ø)

... and 168 files with indirect coverage changes

📣 We’re building smart automated test selection to slash your CI/CD build times. Learn more

@cguedes cguedes changed the title feat: Reimplement flow for pdf uploads to work with the new http backend api 🚧 🚧 feat: Reimplement flow for pdf uploads to work with the new http backend api Aug 23, 2023

@danvk danvk left a comment

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A few comments for you. In general we should try to avoid as much "adapter" code as we can since both the server and client are under our control. Rather than adapting the server response in client code, just change the server to return what we want.

Comment thread src/api/projectsAPI.ts Outdated

export async function getRemoteProject(projectId: string) {
const res = await fetch(`/api/projects/${projectId}`);
const data: ProjectIdResponse = res.json() as unknown as ProjectIdResponse;

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should adapt the codegen setup before we go too far down this path. I can take a crack at that.

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You are right. We need to iron the projects API in line with what we have discussed early (ex: stop exposing the internal server path). Created a ticket for that here.

Comment thread src/api/projectsAPI.ts
},
});
const payload = (await response.json()) as Record<string, string>;
const projectId = Object.keys(payload)[0];

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's update the Python API to produce the response type that we want.

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same comment as above.

Comment thread src/web.ts Outdated
Comment thread src/web.ts
export async function readProjectFilesFromWeb(projectId: string) {
const projectInfo = await getRemoteProject(projectId);
const relativePaths = projectInfo.filepaths
.map((path) => path.replace(projectInfo.project_path + '/', '')) // Remove project path from API output

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should this logic all be on the python side or here?

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, something for #414

Comment thread src/web.ts
// ########################################################################################
// FILESYSTEM
// ########################################################################################
export const writeTextFileFromWeb = async (projectId: string, filePath: string, contents: string) => {

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

export const writeTextFileFromWeb = writeRemoteTextFile;

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks. In the end maybe I won't have the ...FromWeb and ...Remote... versions. So I will keep this in mind.

.filter((x) => !!x)
.join('/')) as NormalizedPath;
}
const fs = new Map<NormalizedPath, FakeNode>();

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Very happy that this code had such a short life :)

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:-) me too.

Comment thread src/wrappers/tauri-api-stubs/fs.ts Outdated
export const readBinaryFile: typeof tauriFs.readBinaryFile = async (filePath) => {
const f = await readTextFile(filePath);
return new TextEncoder().encode(f);
/** MIGRATED */

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If this needs to stay in, you can use @deprecated to make it appear struck-through as a reminder at call sites.

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm sure we will remove this :-) very soon.

Comment thread src/io/filesystem.ts
return Promise.all(entries.map(convertTauriFileEntryToFileEntry));
if (isRunningOnWeb()) {
console.log('reading file structure from web');
const entries = await readProjectFilesFromWeb(currentProjectId);

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Try to factor out as much shared code as you can, i.e.

let entries;
if (isRunningOnWeb()) {
  entries = await readProjectFilesFromWeb(currentProjectId);
} else {
  entries = await readDir(systemBaseDir, { recursive: true });
}
return Promise.all(entries.map(convertTauriFileEntryToFileEntry));

the appeal of the tauri-api-stubs approach was that it naturally avoided this kind of duplication.

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right. This time I've prioritised making the two code paths clear. I think that this will very soon be the same code for desktop and web.

Comment thread src/io/filesystem.ts
}

let currentProjectId = '';
export function setCurrentProjectId(projectId: string) {

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is duplicative of the atom state. We should either pass the project ID to the filesystem functions or, if that involves too much repetition, we could create a FilesystemForProject class that takes it once in the constructor.

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sharp eyes @danvk :-).

I wasn't sure (still not 100% sure) how to make the current project id available throughout the app.
I started with storing the id in the project store, but then figured out that for the filesystem calls would be much simpler to just save the currentProjectId and use it for the "web".

With the "unification" of desktop and web I think that this will be cleaner in your next review.

Comment thread python/sidecar/http.py Outdated
# -----------
@sidecar_api.post("/ingest")
async def http_ingest(project_id: str) -> IngestResponse:
async def http_ingest(project_id: str, request: Request) -> IngestResponse:

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why do these handlers all need to take a request now?

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

They needed before to set env vars from the headers. Now that code was moved to a middleware and I haven't removed the unused parameters. You can see that change here.

Shouldn't this be caught by the python linter? (during yarn check and git push)

Comment thread python/sidecar/http.py Outdated
# -----------
@sidecar_api.post("/rewrite")
async def http_rewrite(req: RewriteRequest) -> RewriteResponse:
async def http_rewrite(req: RewriteRequest, request: Request) -> RewriteResponse:

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is the second request parameter and why do we need it? It seems we don't use it anyway

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch. We don't need it anymore. Removing....

Comment thread src/api/ingestion.ts
const response = await callSidecar('ingest', { pdf_directory: String(uploadsDir) });
return parsePdfIngestionResponse(response);
export async function runPDFIngestion(projectId?: string): Promise<ReferenceItem[]> {
if (import.meta.env.VITE_IS_WEB) {

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't really like having to make this check in each function. I would rather define an IngestionAPI abstract interface, with the methods we need, and then create 2 implementations WebIngestionAPI and SidecarIngestionAPI of this interface and export the correct implementation depending on VITE_IS_WEB

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I haven't implement that bc we will remove the calls via sidecar for the references/ingestion process. Maybe I should make them into this branch actually!

Comment thread src/api/ingestion.ts
export async function runPDFIngestion(projectId?: string): Promise<ReferenceItem[]> {
if (import.meta.env.VITE_IS_WEB) {
if (!projectId) {
throw new Error('Project ID is required for web version');

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Couldn't we make the parameter mandatory instead? The frontend should not depend on the implementation and should send an id in any case, so it's not useful to make it optional, even if we don't use it in the sidecar implementation

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes. That will be implemented in the follow-up PR with the unified desktop+web code to use HTTP.

: await save({ defaultPath: await getNewProjectsBaseDir() });
if (typeof newProjectPath === 'string') {
await newProject(newProjectPath);
if (import.meta.env.VITE_IS_WEB) {

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: Maybe this logic shouldn't live in this file

@cguedes cguedes changed the title 🚧 feat: Reimplement flow for pdf uploads to work with the new http backend api 🚧 feat: Adopt HTTP apis for web deployment Aug 25, 2023
@cguedes cguedes marked this pull request as ready for review August 25, 2023 13:03
Comment thread python/sidecar/http.py


@references_api.put("/{project_id}/{reference_id}")
@references_api.patch("/{project_id}/{reference_id}")

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@gjreda FYI I've updated this to PATCH.

export const writeBinaryFile = import.meta.env.VITE_IS_WEB ? stubFs.writeBinaryFile : tauriFs.writeBinaryFile;
export const renameFile = import.meta.env.VITE_IS_WEB ? stubFs.renameFile : tauriFs.renameFile;
export const createDir = import.meta.env.VITE_IS_WEB
? dontUseFsInWeb<ReturnType<typeof tauriFs.createDir>>

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is just a sanity check. We will remove this in the follow-up PR.

Comment thread src/web.ts
const file = new File([content], fileName, { type: 'text/plain' });
data.append('file', file);

await fetch(`/api/fs/${projectId}/${filePath}`, {

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You are right. As this PR was exploratory, I didn't implemented those validations. Will make sure we have an issue to fix this within the context of #302 .

Comment thread src/atoms/projectState.ts
// -------------
// Web
// -------------
export const newWebProjectAtom = atom(null, async (_, set, projectId: string, path: string, name: string) => {

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This will be unified with the desktop version in the follow-up PR.

@cguedes cguedes changed the title 🚧 feat: Adopt HTTP apis for web deployment feat: Adopt HTTP apis for web deployment Aug 25, 2023
@cguedes cguedes merged commit 325c4e7 into main Aug 25, 2023
@cguedes cguedes deleted the 381-reimplement-flow-for-pdf-uploads-to-work-with-the-new-http-backend-api branch August 25, 2023 13:36
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Reimplement flow for PDF uploads to work with the new HTTP backend API

4 participants