
The root_path of a request is not known when running make_absolute_url #690

@sm-Fifteen

I've been looking into writing tests for and eventually solving #604 (see #681), but this has turned out to be difficult because of the way URLPath.make_absolute_url(self, base_url), URL.__init__(self, scope) and HTTPConnection.url_for(self, name) seem to interact with one another.

I initially assumed that make_absolute_url worked somewhat like urllib.parse.urljoin, where a URL path is made absolute by resolving it against a base URL: relative paths become absolute, and absolute paths have their missing components copied from the base URL. The problem I was trying to solve concerned scope['root_path']. Based on that assumption, one would expect this test case to work:

from starlette.datastructures import URL, URLPath


def test_urlpath_make_absolute():
    url_path_abs = URLPath("/foo/bar", "http")
    url_path_rel = URLPath("fizz/buzz", "http")

    base_url = URL(scope={
        "type":"http",
        "server":("127.0.0.1", 8000),
        "headers": [(b"host", b"localhost:8000")],
        "path":"/urlpath",
        "query_string": b"",
        "scheme":"http",
    })

    abs_to_abs_url = url_path_abs.make_absolute_url(base_url)
    assert abs_to_abs_url == 'http://localhost:8000/foo/bar'
    
    rel_to_abs_url = url_path_rel.make_absolute_url(base_url)
    # rel_to_abs_url is actually 'http://localhost:8000/fizz/buzz'
    assert rel_to_abs_url == 'http://localhost:8000/urlpath/fizz/buzz'

That second assertion fails because make_absolute_url ignores most of the information in the base URL: it builds a new URL from the base URL's scheme and netloc, and uses the URLPath as the entire path.

https://github.com/encode/starlette/blob/c0bf5e3976542de2a0d0441bee0b5d7b2f83050e/starlette/datastructures.py#L172-L188
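To make the difference concrete, here is a minimal standard-library sketch comparing the two behaviours. `replace_path_sketch` is a hypothetical stand-in written from the description above (keep scheme and netloc, replace the path); it is not Starlette's actual code. Note that urljoin itself only yields `/urlpath/fizz/buzz` when the base path ends with a slash, since relative resolution otherwise drops the last segment of the base path.

```python
from urllib.parse import urljoin, urlsplit, urlunsplit


def replace_path_sketch(path: str, base_url: str) -> str:
    """Hypothetical approximation of the behaviour described above:
    keep the base URL's scheme and netloc, use `path` as the whole path."""
    base = urlsplit(base_url)
    return urlunsplit((base.scheme, base.netloc, path, "", ""))


base = "http://localhost:8000/urlpath"

# The base URL's own path ("/urlpath") never appears in these results.
replaced_abs = replace_path_sketch("/foo/bar", base)
replaced_rel = replace_path_sketch("fizz/buzz", base)

# urljoin resolves the reference against the base path instead.
joined_abs = urljoin(base, "/foo/bar")
joined_rel = urljoin(base, "fizz/buzz")              # last base segment is dropped
joined_rel_slash = urljoin(base + "/", "fizz/buzz")  # trailing slash keeps it

for label, value in [
    ("replace, absolute path", replaced_abs),
    ("replace, relative path", replaced_rel),
    ("urljoin, absolute path", joined_abs),
    ("urljoin, relative path", joined_rel),
    ("urljoin, relative path, base ends in '/'", joined_rel_slash),
]:
    print(f"{label}: {value}")
```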

Admittedly, that's not a very big problem on its own, but it becomes a lot more problematic once root_path is involved. HTTPConnection.url_for(self, name) uses the connection's URL as its base URL, and while the connection's URL does account for root_path when calculating the value of conn.url.path, that path then gets discarded by make_absolute_url.

I don't think I can fix this myself, because there is no way to know what root_path was by the time make_absolute_url runs: the URL provided by HTTPConnection has already merged it with the requested path and does not remember it. URL or HTTPConnection would need to be modified to make this possible, and URL looks rather tightly coupled to urllib.parse.urlsplit, so I don't know how much margin for modification there even is with that data structure.
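The loss of information can be illustrated with a small standard-library sketch. `url_from_scope_sketch` is a hypothetical helper that assumes the URL path is simply root_path concatenated with path, as described above; it is not Starlette's implementation. Two different scopes produce the same URL string, so root_path cannot be recovered from the URL alone.

```python
from urllib.parse import urlunsplit


def url_from_scope_sketch(scope: dict) -> str:
    """Hypothetical stand-in for building a URL from an ASGI scope.
    Assumption: the URL path is root_path + path, merged into one string."""
    headers = dict(scope.get("headers", []))
    netloc = headers.get(b"host", b"").decode("latin-1")
    path = scope.get("root_path", "") + scope["path"]
    query = scope.get("query_string", b"").decode("latin-1")
    return urlunsplit((scope["scheme"], netloc, path, query, ""))


common = {
    "type": "http",
    "scheme": "http",
    "headers": [(b"host", b"localhost:8000")],
    "query_string": b"",
}

# Application served under /root, request for /urlpath.
with_root = {**common, "root_path": "/root", "path": "/urlpath"}
# Application served at the top level, request for /root/urlpath.
without_root = {**common, "path": "/root/urlpath"}

url_a = url_from_scope_sketch(with_root)
url_b = url_from_scope_sketch(without_root)

# Both scopes collapse to the same URL, so the split point is lost.
print(url_a)
print(url_b)
```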


I'd written these test cases (not the same as above; notice the use of root_path), but after looking into the inner workings of URL, I'm not even sure they can be made to pass without heavy modifications to how URL works.

from starlette.datastructures import URL, URLPath


def test_urlpath_make_absolute_with_rootpath():
    url_path_abs = URLPath("/foo/bar", "http")
    url_path_rel = URLPath("fizz/buzz", "http")

    base_url = URL(scope={
        "type":"http",
        "server":("127.0.0.1", 8000),
        "headers": [(b"host", b"localhost:8000")],
        "path":"/urlpath",
        "root_path":"/root",
        "query_string": b"",
        "scheme":"http",
    })

    abs_to_abs_url = url_path_abs.make_absolute_url(base_url)
    assert abs_to_abs_url == 'http://localhost:8000/root/foo/bar'

    rel_to_abs_url = url_path_rel.make_absolute_url(base_url)
    assert rel_to_abs_url == 'http://localhost:8000/root/urlpath/fizz/buzz'
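For reference, the semantics these assertions encode can be sketched with the standard library alone. `expected_absolute_url` is a hypothetical function, not a proposed Starlette API; it only produces the expected results because root_path and the request path are passed in separately, which is exactly the information make_absolute_url does not currently receive.

```python
from urllib.parse import urlunsplit


def expected_absolute_url(
    path: str, scheme: str, netloc: str, root_path: str, request_path: str
) -> str:
    """Hypothetical sketch of the behaviour the test above expects."""
    if path.startswith("/"):
        # Absolute paths are re-rooted under root_path.
        full_path = root_path + path
    else:
        # Relative paths are appended to root_path + the request path.
        full_path = root_path + request_path.rstrip("/") + "/" + path
    return urlunsplit((scheme, netloc, full_path, "", ""))


abs_result = expected_absolute_url(
    "/foo/bar", "http", "localhost:8000", "/root", "/urlpath"
)
rel_result = expected_absolute_url(
    "fizz/buzz", "http", "localhost:8000", "/root", "/urlpath"
)
print(abs_result)
print(rel_result)
```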
