Discovery

Before we can make an HTTP request to a server, we need to know the hostname of the server to which the request will be made. This hostname may not necessarily be the same as the user’s server name from the user ID. This is similar to the way that the hostname for a user’s IMAP and SMTP servers is not necessarily the same as the domain name in their email address.

The process of determining the hostname for a Matrix homeserver is called discovery. Discovery can be done in different ways. For the client, the easiest way is to simply be told what hostname to use, such as by having the user type it in. However, this is usually not very user-friendly, although it could be suitable in some cases. For example, in a corporate environment, where a system administrator deploys Matrix clients for the organization, the system administrator could configure the hostname for all the users.

In general, the recommended way to discover the homeserver’s hostname from the user’s server name, either by parsing the user ID (the server name can be found by taking everything after the first :), or by having the server name specified separately from the username. After removing the port number (if present), the client makes a request to https://{serverName}/.well-known/matrix/client. If this results in a 200 HTTP status code with a response body that is a JSON object (note that the HTTP Content-Type header does not need to indicate that the body is JSON) of the form

{
  "m.homeserver": {
    "base_url": "{baseUrl}"
  }
}

(note that it may have other properties as well), and "{baseUrl}" is a string containing a valid URL, then this should be used as the base URL for HTTP requests. The client should also make an HTTP GET request to {baseUrl}/_matrix/client/versions to ensure that the URL actually points to a Matrix homeserver by checking that the response body follows the expected format. In particular, we will ensure that it is an object with a versions property that is a list of strings.

We will create a function in our client module that will perform discovery on a server name and return the discovery result, if any. It will return the discovered base client-server URL (the homeserver’s base URL with _matrix/client appended), as well as the contents of the .well-known/matrix/client file and the output of the /versions API call. The .well-known/matrix/client file may contain other useful information that may be of interest to the client, and the /versions API call will inform us of what features the homeserver supports.

client module functions:
async def discover(server_name: str) -> typing.Optional[tuple[str, dict, dict]]:
    """Perform discovery on the given server name.

    On success, returns a tuple consisting of homeserver's base client-server
    URL (the homeserver's base URL with ``_matrix/client`` appended), the full
    contents of the JSON object returned from the ``.well-known`` file, and the
    full contents of the server's ``/_matrix/client/versions`` response.  On
    failure, returns ``None``.
    """

    # strip off port number
    port_match = re.match(r"^(.*):\d+$", server_name)
    if port_match:
        server_name = port_match.group(1)

    # FIXME: ensure server_name only has valid characters

    try:
        discovery_url = f"https://{server_name}/.well-known/matrix/client"
        async with aiohttp.ClientSession() as session:
            async with session.get(discovery_url) as discovery_resp:
                assert discovery_resp.status == 200
                result = await discovery_resp.json(
                    content_type=None
                )  # ignore the content type

                # check discovered URL
                base_url = urlparse(result["m.homeserver"]["base_url"])
                if base_url.scheme != "http" and base_url.scheme != "https":
                    return None
                # ensure nonempty path ends with a "/" so that URL joining works
                if base_url.path != "" and not base_url.path.endswith("/"):
                    base_url.path += "/"

                base_client_url = urljoin(base_url.geturl(), "_matrix/client/")

                versions_url = urljoin(base_client_url, "versions")
                async with session.get(versions_url) as versions_resp:
                    code, versions = await check_response(versions_resp)
                    assert schema.is_valid(versions, {"versions": list[str]})

                return (base_client_url, result, versions)
    except Exception:
        return None
Tests

We write tests to ensure that our discover function operates as expected in different conditions.

test base:
@pytest.mark.asyncio
async def test_discover(mock_aioresponse):
    {{discovery tests}}

First, we test that it can perform successful discovery.

discovery tests:
mock_aioresponse.get(
    "https://example.org/.well-known/matrix/client",
    status=200,
    body='{"m.homeserver":{"base_url":"https://matrix-client.example.org"}}',
    headers={
        "content-type": "application/json",
    },
)
mock_aioresponse.get(
    "https://matrix-client.example.org/_matrix/client/versions",
    status=200,
    body='{"versions":["v1.1"]}',
    headers={
        "content-type": "application/json",
    },
)
assert await client.discover("example.org") == (
    "https://matrix-client.example.org/_matrix/client/",
    {"m.homeserver": {"base_url": "https://matrix-client.example.org"}},
    {"versions": ["v1.1"]},
)

We also test that it succeeds if the .well-known file is served with a non-JSON content-type.

discovery tests:
mock_aioresponse.get(
    "https://example.org/.well-known/matrix/client",
    status=200,
    body='{"m.homeserver":{"base_url":"https://matrix-client.example.org"}}',
    headers={
        "content-type": "text/plain",
    },
)
mock_aioresponse.get(
    "https://matrix-client.example.org/_matrix/client/versions",
    status=200,
    body='{"versions":["v1.1"]}',
    headers={
        "content-type": "application/json",
    },
)
assert await client.discover("example.org") == (
    "https://matrix-client.example.org/_matrix/client/",
    {"m.homeserver": {"base_url": "https://matrix-client.example.org"}},
    {"versions": ["v1.1"]},
)

Todo

test errors

Example: Discovery script

We can create a simple script that, given a Matrix server name, will print out the server discovery results.

examples/discover.py:
# {{copyright}}

"""Perform discovery on a Matrix server name"""

import asyncio
import json
import sys

from matrixlib import client


if len(sys.argv) != 2:
    print(__doc__)
    print()
    print(f"Usage: {sys.argv[0]} <servername>")
    exit(1)


async def main():
    print(f"Looking for Matrix server for {sys.argv[1]}...")
    res = await client.discover(sys.argv[1])
    if res == None:
        print("No Matrix server found")
    else:
        base, well_known, versions = res
        print(f"Base URL: {base}")
        print("Well-known data:")
        print(json.dumps(well_known, indent=2))
        print(f"Supported API versions: {versions['versions']}")


asyncio.run(main())

Here is the output from performing discovery on matrix.org as of the time of writing.

# python3 examples/discover.py matrix.org
Base URL: https://matrix-client.matrix.org/_matrix/client/
Well-known data:
{
  "m.homeserver": {
    "base_url": "https://matrix-client.matrix.org"
  },
  "m.identity_server": {
    "base_url": "https://vector.im"
  },
  "org.matrix.msc3575.proxy": {
    "url": "https://slidingsync.lab.matrix.org"
  }
}
Supported API versions: ['r0.0.1', 'r0.1.0', 'r0.2.0', 'r0.3.0', 'r0.4.0', 'r0.5.0', 'r0.6.0', 'r0.6.1', 'v1.1', 'v1.2', 'v1.3', 'v1.4', 'v1.5']