226 lines
11 KiB
Markdown
226 lines
11 KiB
Markdown
# Flex Object media proxy
|
|
|
|
**Status:** prototype (opt-in, disabled by default)
|
|
**Target:** flex-objects 1.4.x
|
|
**Related:** [getgrav/grav#4129](https://github.com/getgrav/grav/issues/4129)
|
|
|
|
## Problem
|
|
|
|
A Flex Object stores its data file and its uploaded media in the **same**
|
|
folder under `user://data/<type>/<key>/`:
|
|
|
|
```
|
|
user/data/contacts/0001/
|
|
item.yaml <- object data: must NEVER be web-served
|
|
avatar.jpg <- uploaded media: has always been web-served
|
|
resume.pdf <- uploaded file: may be public OR private
|
|
```
|
|
|
|
Historically the media was linked with a direct `/user/data/...` URL, so the
|
|
**webserver** — not Grav — decides who can read it. That has two consequences:
|
|
|
|
1. A blanket deny on `user/data` (added as a security hardening in Grav core)
|
|
returns a `403` for every Flex Object image. See grav#4129.
|
|
2. There is no way to keep a Flex Object's media *private*: anyone who knows or
|
|
guesses the path can fetch it, regardless of the object's read permissions.
|
|
|
|
The webserver-level mitigation (Grav core) keeps `user/data` denied **except**
|
|
for a fixed set of public media extensions, which unbreaks images without
|
|
re-exposing data files. That is the compatibility floor. This proxy is the
|
|
application-level half: **store in `user://data`, retrieve through Grav.**
|
|
|
|
## Goal
|
|
|
|
Serve a single Flex Object media file through PHP after resolving the owning
|
|
object and (optionally) checking its read ACL, so that:
|
|
|
|
- object media can live under a fully locked-down `user/data`;
|
|
- *private* object media is actually enforceable (per-object read permission);
|
|
- existing direct URLs keep working during the transition (the webserver
|
|
carve-out stays until content is fully migrated to proxy URLs).
|
|
|
|
## Target architecture
|
|
|
|
The intended end-state is **proxy-on by default**: the proxy is the *sole* path
|
|
to Flex Object media, and `user/data` is **full-deny** at the webserver. Opting
|
|
out reverts to **webserver-config-only** (direct URLs + the media carve-out).
|
|
|
|
```
|
|
media_proxy.enabled: true (target default) media_proxy.enabled: false (opt-out)
|
|
── Medium::url() for user://data originals ── Medium::url() emits direct
|
|
emits proxy URLs (images/ derivatives /user/data/... URLs
|
|
stay direct) ── webserver config must carry the
|
|
── webserver config = simple full-deny media carve-out (grav#4129) or
|
|
of user/data (no carve-out) images 403
|
|
── media delivery no longer depends on the ── media delivery depends entirely
|
|
webserver being configured at all on correct per-server config
|
|
```
|
|
|
|
**Why proxy-on is the more reliable default.** The auto-applied hardening only
|
|
covers Apache (`.htaccess`); nginx, Caddy, IIS and lighttpd users must hand-edit
|
|
their config, and the *media carve-out* is the fragile, hard-to-replicate part
|
|
(a per-server regex — see the IIS/Caddy caveats in the core change). With the
|
|
proxy as the sole media path, the webserver rule collapses to the simplest,
|
|
most portable thing there is — "deny `user/data`" — and **media keeps working
|
|
even on a server the admin never configured**, because the proxy serves through
|
|
`index.php`, which is always reachable.
|
|
|
|
**Important caveat — the proxy cannot protect the data files themselves.** A
|
|
direct request for `user/data/<type>/<key>/item.yaml` never reaches PHP, so Grav
|
|
cannot intercept it; only a webserver deny (or storing `user/` outside the web
|
|
root via `GRAV_USER_PATH`) keeps data files private. So the proxy makes *media
|
|
delivery* webserver-independent, but **data-file security still requires the
|
|
full-deny rule** — the win is that that rule is now the trivial, portable
|
|
one-liner instead of the fragile carve-out. Pushing `user/` out of the web root
|
|
remains the only way to make data-file security fully config-independent, and is
|
|
the recommended complementary hardening.
|
|
|
|
## Route
|
|
|
|
```
|
|
GET <base>/<type>/<key>/<filename>[?field=<field>]
|
|
```
|
|
|
|
- `base` — configurable, default `/flex-media`.
|
|
- `type` — flex directory key (e.g. `contacts`).
|
|
- `key` — object key.
|
|
- `filename` — media filename (may include sub-path segments).
|
|
- `field` — optional; resolve media from a specific field's collection
|
|
(file/avatar/pagemedia field with a custom `destination`) rather than the
|
|
object's own media.
|
|
|
|
Registered on `onPagesInitialized` at high priority (before the default flex
|
|
router and the 404 handler), gated by `media_proxy.enabled`.
|
|
|
|
## Behaviour
|
|
|
|
| Condition | Response |
|
|
|---|---|
|
|
| Proxy disabled | handler returns, request continues normally |
|
|
| `..`, leading `.`, or non-serveable extension in filename | `404` |
|
|
| Object missing / does not exist | `404` |
|
|
| `authorize: true` and `object.isAuthorized('read','frontend',user) === false` | `403` |
|
|
| Media item not found on the object | `404` |
|
|
| Fresh client copy (`If-None-Match` / `If-Modified-Since`) | `304` |
|
|
| Valid `Range` header | `206` partial |
|
|
| Otherwise | `200` streamed file |
|
|
|
|
Served responses set `Content-Type`, `Content-Length`, `Last-Modified`, `ETag`,
|
|
`Accept-Ranges`, `Cache-Control`, `X-Content-Type-Options: nosniff`, and
|
|
`Content-Disposition: inline`. The body is streamed from a file resource (full
|
|
requests) so large files are not buffered in memory.
|
|
|
|
### Serveable extensions
|
|
|
|
The proxy refuses anything outside an allow-list
|
|
(`jpg jpeg png gif webp avif bmp ico mp4 webm ogg ogv mov mp3 wav m4a flac pdf`)
|
|
so it can never hand out data files, databases or keys even if a caller crafts a
|
|
filename. **SVG is intentionally excluded** (stored-XSS vector), matching the
|
|
core `.htaccess` allow-list.
|
|
|
|
### Permission model
|
|
|
|
Reads are **not ACL-gated for now** — `authorize` defaults to `false`, so the
|
|
proxy is a pure routing/integrity gate (serve any existing media, no permission
|
|
check). Its purpose at this stage is a single retrieval chokepoint, not
|
|
per-object access control.
|
|
|
|
The capability is kept behind the flag: setting `authorize: true` makes the
|
|
proxy deny only on an **explicit** `false` from the object's read check, so
|
|
directories without a read ACL keep behaving as public media (no regression)
|
|
while directories that *do* define a read restriction get it enforced. Turn this
|
|
on only once the caching story for private media (below) is settled.
|
|
|
|
## Configuration
|
|
|
|
```yaml
|
|
media_proxy:
|
|
enabled: false # opt-in while prototyping
|
|
base: '/flex-media' # public route prefix
|
|
authorize: true # honour the object's read ACL
|
|
cache_control: 'public, max-age=604800'
|
|
```
|
|
|
|
## Generating URLs
|
|
|
|
When `media_proxy.enabled`, **`medium.url` already routes through the proxy** —
|
|
existing templates need no change:
|
|
|
|
```twig
|
|
<img src="{{ object.media['avatar.jpg'].url }}"> {# proxied original #}
|
|
<img src="{{ object.media['avatar.jpg'].cropResize(300,300).url }}"> {# derivative, served from images/ #}
|
|
```
|
|
|
|
This works via three pieces (all behind the flag):
|
|
|
|
1. **Core — `onFlexObjectMedia` event** (`FlexMediaTrait::getMedia()`). Fired once
|
|
per object when its media collection is first built, passing the object and
|
|
the collection so a listener can stamp a `url` override on each item.
|
|
2. **Plugin — listener** stamps `MediaProxyController::url($object, $filename)`
|
|
onto every media item as its `url` override.
|
|
3. **Core — `ImageMedium::url()`** honors that override **only for the unmodified
|
|
original** — gated on `empty($this->image)`, the same condition under which
|
|
`saveImage()` returns the source file. The moment a modifier is applied
|
|
(`$this->image` is instantiated) the override is skipped and the derivative
|
|
serves from `images/`. Non-image files already honor the override via
|
|
`MediaFileTrait::url()`.
|
|
|
|
`flex_media_url(object, filename, field)` / `MediaProxyController::url(...)` remain
|
|
available for explicit links (e.g. field-scoped media not covered by the
|
|
object's own collection).
|
|
|
|
### Known nuances (prototype)
|
|
|
|
- The override is returned verbatim, so the media-timestamp cache-buster
|
|
(`?<mtime>`) is not appended to proxied originals. The proxy already sends
|
|
`ETag`/`Last-Modified`, but add the timestamp if byte-identical replacement
|
|
under the same filename must bust shared caches.
|
|
- `srcset()` retina alternatives of an unmodified image also resolve to the
|
|
override; per-alternative handling can come later.
|
|
- The persistent media cache stores the **un-stamped** collection (the event
|
|
fires after `MediaTrait::getMedia()` caches it), so toggling `enabled` only
|
|
needs a normal cache clear — and admin requests (no listener) are unaffected.
|
|
|
|
## Open items / core follow-up
|
|
|
|
1. **Automatic URL rewriting (core) — DONE.** `medium.url` now emits the proxy
|
|
URL when `media_proxy.enabled`, via the `onFlexObjectMedia` event +
|
|
`ImageMedium::url()` override described under "Generating URLs". Implemented in
|
|
`system/src/Grav/Framework/Flex/Traits/FlexMediaTrait.php` (event) and
|
|
`system/src/Grav/Common/Page/Medium/ImageMedium.php` (override).
|
|
|
|
**Scope — only `user://data` originals, never `images/`** (satisfied by the
|
|
implementation). The override is honored *only* on the original branch; the
|
|
moment any modifier is applied (`.cropResize(...)`, `.resize(...)`, etc.) the
|
|
result lives under `images/` and is served direct, untouched. Only `images/`
|
|
is exempt — no other path. Remaining: the stamping listener currently applies
|
|
to every object's own media collection regardless of where it is stored;
|
|
restricting it to `user://data`-backed directories (vs. custom destinations)
|
|
is a follow-up knob.
|
|
2. **Re-tighten `user/data` to full-deny (target default).** Once content emits
|
|
proxy URLs, the core `.htaccess` media carve-out (grav#4129) is dropped and
|
|
`user/data` goes back to a simple full-deny, because the proxy — not the
|
|
webserver — serves the `user://data` originals (derivatives already live under
|
|
`images/`). Sequencing matters: don't re-tighten until rewriting is in and
|
|
content is migrated.
|
|
|
|
**Config coordination.** The webserver rule that's correct depends on
|
|
`media_proxy.enabled`, but the rule lives in core's installer
|
|
(`.htaccess`/`webserver-configs/`), which doesn't read the plugin setting. So
|
|
the two modes can't both be auto-correct from one shipped file. Resolution:
|
|
ship the **carve-out** form by default (safe whether or not the proxy is on —
|
|
it never *blocks* media), and have the Admin security check detect
|
|
`media_proxy.enabled = true` and recommend (or, for Apache, offer to apply)
|
|
the simpler full-deny. Flipping the shipped default to full-deny only makes
|
|
sense once proxy-on is itself the default and the `Medium::url()` rewrite has
|
|
landed.
|
|
3. **Caching headers vs. private media.** `cache_control: public` is correct
|
|
while reads are not ACL-gated. If `authorize` is later turned on, hits to
|
|
restricted objects must instead send `private, no-store`; production should
|
|
vary the header by whether a read restriction applied. This is the gate that
|
|
must be built before `authorize: true` is recommended.
|
|
4. **Range/conditional hardening.** Multi-range requests are not supported
|
|
(single range only); revisit if large-video streaming is a real use case.
|
|
5. **Signed URLs (optional).** For private media shared via expiring links,
|
|
consider an HMAC-signed variant of the route as an alternative to session ACL.
|