This is so workers can call these functions.
This was preventing the [Delete Room Admin
API](https://element-hq.github.io/synapse/latest/admin_api/rooms.html#version-2-new-version)
from succeeding when `block: true` was specified. This was because we
had `run_background_tasks_on` configured to run on a separate worker.
As workers weren't able to call the `block_room` storage function before
this PR, the (delete room) task failed when taken off the queue by the
worker.
Previously, a value like `5q` would be interpreted as 5 milliseconds. We
should just raise an error instead of letting someone run with a
misconfiguration.
This PR changes the logic so that deactivated users are always ignored.
Suspended users were already effectively ignored as Synapse forbids a
join while suspended.
---------
Co-authored-by: Devon Hudson <devon.dmytro@gmail.com>
This also happens for rejecting an invite. Basically, any out-of-band membership transition where we first get the membership as an `outlier` and then rely on federation filling us in to de-outlier it.
This PR mainly addresses automated test flakiness, bots/scripts, and options within Synapse like [`auto_accept_invites`](https://element-hq.github.io/synapse/v1.122/usage/configuration/config_documentation.html#auto_accept_invites) that are able to react quickly (before federation is able to push us events), but also helps in generic scenarios where federation is lagging.
I initially thought this might be a Synapse consistency issue (see issues labeled with [`Z-Read-After-Write`](https://github.com/matrix-org/synapse/labels/Z-Read-After-Write)) but it seems to be an event auth logic problem. Workers probably do increase the number of possible race condition scenarios that make this visible though (replication and cache invalidation lag).
Fix https://github.com/element-hq/synapse/issues/15012
(probably fixes https://github.com/matrix-org/synapse/issues/15012 (https://github.com/element-hq/synapse/issues/15012))
Related to https://github.com/matrix-org/matrix-spec/issues/2062
Problems:
1. We don't consider [out-of-band membership](https://github.com/element-hq/synapse/blob/develop/docs/development/room-dag-concepts.md#out-of-band-membership-events) (outliers) in our `event_auth` logic even though we expose them in `/sync`.
1. (This PR doesn't address this point) Perhaps we should consider authing events in the persistence queue as events already in the queue could allow subsequent events to be allowed (events come through many channels: federation transaction, remote invite, remote join, local send). But this doesn't save us in the case where the event is more delayed over federation.
### What happened before?
I wrote some Complement test that stresses this exact scenario and reproduces the problem: https://github.com/matrix-org/complement/pull/757
```
COMPLEMENT_ALWAYS_PRINT_SERVER_LOGS=1 COMPLEMENT_DIR=../complement ./scripts-dev/complement.sh -run TestSynapseConsistency
```
We have `hs1` and `hs2` running in monolith mode (no workers):
1. `@charlie1:hs2` is invited and joins the room:
1. `hs1` invites `@charlie1:hs2` to a room which we receive on `hs2` as `PUT /_matrix/federation/v1/invite/{roomId}/{eventId}` (`on_invite_request(...)`) and the invite membership is persisted as an outlier. The `room_memberships` and `local_current_membership` database tables are also updated which means they are visible down `/sync` at this point.
1. `@charlie1:hs2` decides to join because it saw the invite down `/sync`. Because `hs2` is not yet in the room, this happens as a remote join `make_join`/`send_join` which comes back with all of the auth events needed to auth successfully and now `@charlie1:hs2` is successfully joined to the room.
1. `@charlie2:hs2` is invited and and tries to join the room:
1. `hs1` invites `@charlie2:hs2` to the room which we receive on `hs2` as `PUT /_matrix/federation/v1/invite/{roomId}/{eventId}` (`on_invite_request(...)`) and the invite membership is persisted as an outlier. The `room_memberships` and `local_current_membership` database tables are also updated which means they are visible down `/sync` at this point.
1. Because `hs2` is already participating in the room, we also see the invite come over federation in a transaction and we start processing it (not done yet, see below)
1. `@charlie2:hs2` decides to join because it saw the invite down `/sync`. Because `hs2`, is already in the room, this happens as a local join but we deny the event because our `event_auth` logic thinks that we have no membership in the room ❌ (expected to be able to join because we saw the invite down `/sync`)
1. We finally finish processing the `@charlie2:hs2` invite event from and de-outlier it.
- If this finished before we tried to join we would have been fine but this is the race condition that makes this situation visible.
Logs for `hs2`:
```
🗳️ on_invite_request: handling event <FrozenEventV3 event_id=$PRPCvdXdcqyjdUKP_NxGF2CcukmwOaoK0ZR1WiVOZVk, type=m.room.member, state_key=@user-2-charlie1:hs2, membership=invite, outlier=False>
🔦 _store_room_members_txn update room_memberships: <FrozenEventV3 event_id=$PRPCvdXdcqyjdUKP_NxGF2CcukmwOaoK0ZR1WiVOZVk, type=m.room.member, state_key=@user-2-charlie1:hs2, membership=invite, outlier=True>
🔦 _store_room_members_txn update local_current_membership: <FrozenEventV3 event_id=$PRPCvdXdcqyjdUKP_NxGF2CcukmwOaoK0ZR1WiVOZVk, type=m.room.member, state_key=@user-2-charlie1:hs2, membership=invite, outlier=True>
📨 Notifying about new event <FrozenEventV3 event_id=$PRPCvdXdcqyjdUKP_NxGF2CcukmwOaoK0ZR1WiVOZVk, type=m.room.member, state_key=@user-2-charlie1:hs2, membership=invite, outlier=True>
✅ on_invite_request: handled event <FrozenEventV3 event_id=$PRPCvdXdcqyjdUKP_NxGF2CcukmwOaoK0ZR1WiVOZVk, type=m.room.member, state_key=@user-2-charlie1:hs2, membership=invite, outlier=True>
🧲 do_invite_join for @user-2-charlie1:hs2 in !sfZVBdLUezpPWetrol:hs1
🔦 _store_room_members_txn update room_memberships: <FrozenEventV3 event_id=$bwv8LxFnqfpsw_rhR7OrTjtz09gaJ23MqstKOcs7ygA, type=m.room.member, state_key=@user-1-alice:hs1, membership=join, outlier=True>
🔦 _store_room_members_txn update room_memberships: <FrozenEventV3 event_id=$oju1ts3G3pz5O62IesrxX5is4LxAwU3WPr4xvid5ijI, type=m.room.member, state_key=@user-2-charlie1:hs2, membership=join, outlier=False>
📨 Notifying about new event <FrozenEventV3 event_id=$oju1ts3G3pz5O62IesrxX5is4LxAwU3WPr4xvid5ijI, type=m.room.member, state_key=@user-2-charlie1:hs2, membership=join, outlier=False>
...
🗳️ on_invite_request: handling event <FrozenEventV3 event_id=$O_54j7O--6xMsegY5EVZ9SA-mI4_iHJOIoRwYyeWIPY, type=m.room.member, state_key=@user-3-charlie2:hs2, membership=invite, outlier=False>
🔦 _store_room_members_txn update room_memberships: <FrozenEventV3 event_id=$O_54j7O--6xMsegY5EVZ9SA-mI4_iHJOIoRwYyeWIPY, type=m.room.member, state_key=@user-3-charlie2:hs2, membership=invite, outlier=True>
🔦 _store_room_members_txn update local_current_membership: <FrozenEventV3 event_id=$O_54j7O--6xMsegY5EVZ9SA-mI4_iHJOIoRwYyeWIPY, type=m.room.member, state_key=@user-3-charlie2:hs2, membership=invite, outlier=True>
📨 Notifying about new event <FrozenEventV3 event_id=$O_54j7O--6xMsegY5EVZ9SA-mI4_iHJOIoRwYyeWIPY, type=m.room.member, state_key=@user-3-charlie2:hs2, membership=invite, outlier=True>
✅ on_invite_request: handled event <FrozenEventV3 event_id=$O_54j7O--6xMsegY5EVZ9SA-mI4_iHJOIoRwYyeWIPY, type=m.room.member, state_key=@user-3-charlie2:hs2, membership=invite, outlier=True>
📬 handling received PDU in room !sfZVBdLUezpPWetrol:hs1: <FrozenEventV3 event_id=$O_54j7O--6xMsegY5EVZ9SA-mI4_iHJOIoRwYyeWIPY, type=m.room.member, state_key=@user-3-charlie2:hs2, membership=invite, outlier=False>
📮 handle_new_client_event: handling <FrozenEventV3 event_id=$WNVDTQrxy5tCdPQHMyHyIn7tE4NWqKsZ8Bn8R4WbBSA, type=m.room.member, state_key=@user-3-charlie2:hs2, membership=join, outlier=False>
❌ Denying new event <FrozenEventV3 event_id=$WNVDTQrxy5tCdPQHMyHyIn7tE4NWqKsZ8Bn8R4WbBSA, type=m.room.member, state_key=@user-3-charlie2:hs2, membership=join, outlier=False> because 403: You are not invited to this room.
synapse.http.server - 130 - INFO - POST-16 - <SynapseRequest at 0x7f460c91fbf0 method='POST' uri='/_matrix/client/v3/join/%21sfZVBdLUezpPWetrol:hs1?server_name=hs1' clientproto='HTTP/1.0' site='8080'> SynapseError: 403 - You are not invited to this room.
📨 Notifying about new event <FrozenEventV3 event_id=$O_54j7O--6xMsegY5EVZ9SA-mI4_iHJOIoRwYyeWIPY, type=m.room.member, state_key=@user-3-charlie2:hs2, membership=invite, outlier=False>
✅ handled received PDU in room !sfZVBdLUezpPWetrol:hs1: <FrozenEventV3 event_id=$O_54j7O--6xMsegY5EVZ9SA-mI4_iHJOIoRwYyeWIPY, type=m.room.member, state_key=@user-3-charlie2:hs2, membership=invite, outlier=False>
```
Otherwise these can race with other long running queries and lock out
all other queries.
This caused problems in v1.22.0 as we added an index to `events` table
in #17948, but that got interrupted and so next time we ran the
background update we needed to delete the half-finished index. However,
that got blocked behind some long running queries and then locked other
queries out (stopping workers from even starting).
This is particularly a problem in a state reset scenario where the membership
might change without a corresponding event.
This PR is targeting a scenario where a state reset happens which causes
room membership to change. Previously, the cache would just hold onto
stale data and now we properly bust the cache in this scenario.
We have a few tests for these scenarios which you can see are now fixed
because we can remove the `FIXME` where we were previously manually
busting the cache in the test itself.
This is a general Synapse thing so by it's nature it helps out Sliding
Sync.
Fix https://github.com/element-hq/synapse/issues/17368
Prerequisite for https://github.com/element-hq/synapse/issues/17929
---
Match when are busting `_curr_state_delta_stream_cache`
Adds a query param `type` to `/_synapse/admin/v1/rooms/{room_id}/state`
that filters the state event query by state event type.
---------
Co-authored-by: Andrew Morgan <1342360+anoadragon453@users.noreply.github.com>
Refactor `get_profile` to avoid returning "empty" (`None` / `null`)
fields. Currently this is not very important, but will be more useful
once #17488 lands. It does update the servlet to use this now which has
a minor change in behavior: additional fields served over federation
will now be properly sent back to clients.
It also adds constants for `avatar_url` / `displayname` although I did
not attempt to use it everywhere possible.
`defer.returnValue` was only needed in Python 2; in Python 3, a simple
`return` is fine.
`twisted.internet.defer.returnValue` is deprecated as of Twisted 24.7.0.
Most uses of `returnValue` in synapse were removed a while back; this
cleans up some remaining bits.
This is essentially matrix-org/synapse#14392. I didn't see anything in
there about updating sytest or complement.
The main driver of this is so that I can use `jsonb_path_exists` in
#17488. 😄
Fixes various `mypy` errors associated with Twisted `24.11.0`.
Hopefully addresses https://github.com/element-hq/synapse/issues/17075,
though I've yet to test against `trunk`.
Changes should be compatible with our currently pinned Twisted version
of `24.7.0`.
Supersedes https://github.com/element-hq/synapse/pull/17958.
Awkwardly, the changes made to fix the mypy errors in 1.12.1 cause
errors in 1.11.2. So you'll need to update your mypy version to 1.12.1
to eliminate typechecking errors during developing.
The existing `email.smtp_host` config option is used for two distinct
purposes: it is resolved into the IP address to connect to, and used to
(request via SNI and) validate the server's certificate if TLS is
enabled. This new option allows specifying a different name for the
second purpose.
This is especially helpful, if `email.smtp_host` isn't a global FQDN,
but something that resolves only locally (e.g. "localhost" to connect
through the loopback interface, or some other internally routed name),
that one cannot get a valid certificate for.
Alternatives would of course be to specify a global FQDN as
`email.smtp_host`, or to disable TLS entirely, both of which might be
undesirable, depending on the SMTP server configuration.
Another config option on my quest to a `*_path` variant for every
secret. This time it’s `macaroon_secret_key_path`.
Reading secrets from files has the security advantage of separating the secrets from the config. It also simplifies secrets management in Kubernetes. Also useful to NixOS users.
- Fetch the number of invites the provided user has sent after a given
timestamp
- Fetch the number of rooms the provided user has joined after a given
timestamp, regardless if they have left/been banned from the rooms
subsequently
- Get report IDs of event reports where the provided user was the sender
of the reported event
When rejecting a withdrew invite through federation, an out of band
event needs to be created.
When doing so with a third_party_rules module installed,
`get_prev_state_ids` [is
called](e0fdb862cb/synapse/module_api/callbacks/third_party_event_rules_callbacks.py (L285))
on the context to calculate the state to pass at `check_event_allowed`
callbacks.
The context for outliers is defined
[here](e0fdb862cb/synapse/events/snapshot.py (L168)),
and `state_group_before_event` is None.
This change makes the behavior of `get_prev_state_ids` and
`get_current_state_ids` match the one presented in the docstring
regarding null state_group.
POST requests for account data, receipts and presence require the worker
to be configured as a stream writer. The regular expressions in the
default list don't assume any HTTP method, so if the worker is not a
stream writer, the request fails.
The stream writer section of the documentation lists the same regexps as
the one I'm removing, so people configuring stream writers can still
configure their routing properly.
More context:
https://github.com/element-hq/synapse/issues/17243#issuecomment-2493621645
This is an implementation of MSC4190, which allows appservices to manage
their user's devices without /login & /logout.
---------
Co-authored-by: Andrew Morgan <1342360+anoadragon453@users.noreply.github.com>
Be able to test `/login/sso/redirect` in Complement
Spawning from
https://github.com/element-hq/sbg/pull/421#discussion_r1854926218 where
we have a proxy that intercepts responses to
`/_matrix/client/v3/login/sso/redirect(/{idpId})` in order to upgrade
them to use OAuth 2.0 Pushed Authorization Requests (PAR). We have some
Complement tests in that codebase that go over this flow and these
changes are required [in order for the URL's to line
up](d648c8ce3f/synapse/rest/client/login.py (L652-L673)).
Currently, when a new scheduled task is added and its scheduled time has
already passed, we set it to ACTIVE. This is problematic, because it
means it will jump the queue ahead of all other SCHEDULED tasks;
furthermore, if the Synapse process gets restarted, it will jump ahead
of any ACTIVE tasks which have been started but are taking a while to
run.
Instead, we leave it set to SCHEDULED, but kick off a call to
`_launch_scheduled_tasks`, which will decide if we actually have
capacity to start a new task, and start the newly-added task if so.