I’ve been quiet a minute, but never not working. Lately, I want a homelab replacement for Pocket allowing me to easily bookmark content from desktop or mobile, benefit from AI summarization and auto-tagging, and sync to my mobile device for offline (airplane mode) reading later. No single tool I’ve found does this automatically today, so I glued some pieces together.

This setup uses Karakeep as the bookmark capture tool and Wallabag as the offline reading tool. Both run on a homelab in proxmox LXCs, accessible away from home only via Tailscale.

The goal is not just to copy URLs from Karakeep to Wallabag. That works for many sites, but fails when Wallabag tries to fetch a site that blocks server-side readers with Cloudflare or another bot check.

Instead, this method sends Wallabag the content Karakeep already archived.

Desired flow

1. Bookmark saved in Karakeep
2. Karakeep crawls and archives the page
3. Sync script reads Karakeep's archived HTML asset
4. Sync script posts URL + title + archived HTML to Wallabag
5. Wallabag stores the article for offline reading

Why this exists

Wallabag is good for offline reading, but its normal API flow is URL-based:

1. POST URL to Wallabag
2. Wallabag fetches the page itself

That can fail on sites that present bot checks, JavaScript challenges, or Cloudflare interstitials.

Karakeep, especially when saving through the browser extension, may already have a usable archived copy of the page. Karakeep exposes that saved content as an asset. The sync script uses that asset instead of asking Wallabag to fetch the page again.

Assumptions

This example assumes:

Karakeep URL:   https://karakeep.example.internal
Wallabag URL:   http://wallabag.internal:8000
Script path:    /opt/karakeep-wallabag-sync/sync.py
Env file:       /etc/karakeep-wallabag-sync.env
State file:     /var/lib/karakeep-wallabag-sync/synced.json

Replace these with your own values.

For internal homelab use, it may be simpler to use direct internal HTTP URLs between containers instead of routing through a reverse proxy with internal TLS certificates.

1. Create API credentials

Karakeep

Create a Karakeep API key from the Karakeep UI.

Example placeholder:

KARAKEEP_API_KEY=PASTE_KARAKEEP_API_KEY_HERE

Wallabag

Create a Wallabag API client from Wallabag’s API client management page.

You need:

WALLABAG_CLIENT_ID=PASTE_WALLABAG_CLIENT_ID_HERE
WALLABAG_CLIENT_SECRET=PASTE_WALLABAG_CLIENT_SECRET_HERE
WALLABAG_USERNAME=PASTE_WALLABAG_USERNAME_HERE
WALLABAG_PASSWORD=PASTE_WALLABAG_PASSWORD_HERE

Wallabag’s API uses OAuth password grant in this setup, so the script needs both the API client credentials and the Wallabag user credentials.

2. Install dependencies

Run this on the machine or container where the sync script will live.

In this example, the script runs inside the Karakeep LXC/container.

mkdir -p /opt/karakeep-wallabag-sync
mkdir -p /var/lib/karakeep-wallabag-sync

cd /opt/karakeep-wallabag-sync

python3 -m venv .venv
. .venv/bin/activate
pip install requests

3.Create the environment file

Create:

nano /etc/karakeep-wallabag-sync.env

Example:

KARAKEEP_BASE_URL=https://karakeep.example.internal
KARAKEEP_API_KEY=PASTE_KARAKEEP_API_KEY_HERE

WALLABAG_BASE_URL=http://wallabag.internal:8000
WALLABAG_CLIENT_ID=PASTE_WALLABAG_CLIENT_ID_HERE
WALLABAG_CLIENT_SECRET=PASTE_WALLABAG_CLIENT_SECRET_HERE
WALLABAG_USERNAME=PASTE_WALLABAG_USERNAME_HERE
WALLABAG_PASSWORD=PASTE_WALLABAG_PASSWORD_HERE

STATE_FILE=/var/lib/karakeep-wallabag-sync/synced.json

# Wait at least this long after bookmark creation before syncing.
# This avoids racing Karakeep before its archive assets are ready.
MIN_BOOKMARK_AGE_SECONDS=120

# Ignore tiny archive assets.
MIN_ARCHIVE_CHARS=500

Lock it down:

chmod 600 /etc/karakeep-wallabag-sync.env

4. Create the sync script

Create:

nano /opt/karakeep-wallabag-sync/sync.py

Paste:

#!/usr/bin/env python3

import json
import os
import sys
from datetime import datetime, timezone
from pathlib import Path
from typing import Any

import requests


# ----------------------------
# Config from environment file
# ----------------------------

KARAKEEP_BASE_URL = os.environ["KARAKEEP_BASE_URL"].rstrip("/")
KARAKEEP_API_KEY = os.environ["KARAKEEP_API_KEY"]

WALLABAG_BASE_URL = os.environ["WALLABAG_BASE_URL"].rstrip("/")
WALLABAG_CLIENT_ID = os.environ["WALLABAG_CLIENT_ID"]
WALLABAG_CLIENT_SECRET = os.environ["WALLABAG_CLIENT_SECRET"]
WALLABAG_USERNAME = os.environ["WALLABAG_USERNAME"]
WALLABAG_PASSWORD = os.environ["WALLABAG_PASSWORD"]

STATE_FILE = Path(
    os.environ.get(
        "STATE_FILE",
        "/var/lib/karakeep-wallabag-sync/synced.json",
    )
)

MIN_BOOKMARK_AGE_SECONDS = int(os.environ.get("MIN_BOOKMARK_AGE_SECONDS", "120"))
MIN_ARCHIVE_CHARS = int(os.environ.get("MIN_ARCHIVE_CHARS", "500"))

KARAKEEP_HEADERS = {
    "Authorization": f"Bearer {KARAKEEP_API_KEY}",
    "Accept": "application/json",
}


# ----------------------------
# State file
# ----------------------------

def load_state() -> set[str]:
    if not STATE_FILE.exists():
        return set()

    try:
        data = json.loads(STATE_FILE.read_text())
    except json.JSONDecodeError:
        raise RuntimeError(f"State file is not valid JSON: {STATE_FILE}")

    if not isinstance(data, list):
        raise RuntimeError(f"State file should contain a JSON list: {STATE_FILE}")

    return set(str(x) for x in data)


def save_state(synced_urls: set[str]) -> None:
    STATE_FILE.parent.mkdir(parents=True, exist_ok=True)
    tmp = STATE_FILE.with_suffix(".tmp")
    tmp.write_text(json.dumps(sorted(synced_urls), indent=2))
    tmp.replace(STATE_FILE)


# ----------------------------
# Date helpers
# ----------------------------

def parse_karakeep_datetime(value: Any) -> datetime | None:
    if not isinstance(value, str) or not value.strip():
        return None

    raw = value.strip()

    if raw.endswith("Z"):
        raw = raw[:-1] + "+00:00"

    try:
        return datetime.fromisoformat(raw)
    except ValueError:
        return None


def bookmark_age_seconds(bookmark: dict[str, Any]) -> float | None:
    created_at = parse_karakeep_datetime(bookmark.get("createdAt"))
    if not created_at:
        return None

    if created_at.tzinfo is None:
        created_at = created_at.replace(tzinfo=timezone.utc)

    return (datetime.now(timezone.utc) - created_at.astimezone(timezone.utc)).total_seconds()


# ----------------------------
# Encoding helpers
# ----------------------------

def mojibake_score(text: str) -> int:
    """
    Higher score means text looks more likely to contain mojibake.

    Example of mojibake:
      let’s  ->  letâ??s
    """
    bad_markers = [
        "â", "Â", "Ã", "?",
        "\x80", "\x81", "\x82", "\x83", "\x84", "\x85", "\x86", "\x87",
        "\x88", "\x89", "\x8a", "\x8b", "\x8c", "\x8d", "\x8e", "\x8f",
        "\x90", "\x91", "\x92", "\x93", "\x94", "\x95", "\x96", "\x97",
        "\x98", "\x99", "\x9a", "\x9b", "\x9c", "\x9d", "\x9e", "\x9f",
    ]
    return sum(text.count(marker) for marker in bad_markers)


def repair_mojibake(text: str) -> str:
    """
    Repair common UTF-8-as-Latin1/Windows-1252 mojibake.
    """
    original_score = mojibake_score(text)
    if original_score == 0:
        return text

    candidates = [text]

    for encoding in ("latin1", "cp1252"):
        try:
            candidates.append(text.encode(encoding).decode("utf-8"))
        except UnicodeError:
            pass

    best = min(candidates, key=mojibake_score)

    if mojibake_score(best) < original_score:
        print(f"  Repaired mojibake: score {original_score} -> {mojibake_score(best)}")
        return best

    return text


def decode_response_text(response: requests.Response) -> str:
    """
    Decode Karakeep archive asset content.

    requests may guess ISO-8859-1 for text/html without charset.
    Karakeep HTML assets are usually UTF-8, so try UTF-8 first.
    """
    raw = response.content

    for encoding in ("utf-8-sig", "utf-8"):
        try:
            return repair_mojibake(raw.decode(encoding).strip())
        except UnicodeDecodeError:
            pass

    guessed = response.encoding or response.apparent_encoding or "utf-8"
    return repair_mojibake(raw.decode(guessed, errors="replace").strip())


# ----------------------------
# Karakeep API
# ----------------------------

def karakeep_get(path: str, params: dict[str, Any] | None = None) -> Any:
    response = requests.get(
        f"{KARAKEEP_BASE_URL}/api/v1{path}",
        headers=KARAKEEP_HEADERS,
        params=params,
        timeout=60,
    )
    response.raise_for_status()
    return response.json()


def karakeep_get_raw(path: str) -> requests.Response:
    response = requests.get(
        f"{KARAKEEP_BASE_URL}/api/v1{path}",
        headers=KARAKEEP_HEADERS,
        timeout=120,
    )
    response.raise_for_status()
    return response


def extract_items(data: Any) -> list[dict[str, Any]]:
    if isinstance(data, list):
        return [x for x in data if isinstance(x, dict)]

    if not isinstance(data, dict):
        return []

    for key in ("bookmarks", "items", "data", "results"):
        value = data.get(key)
        if isinstance(value, list):
            return [x for x in value if isinstance(x, dict)]

    return []


def get_next_cursor(data: Any) -> str | None:
    if not isinstance(data, dict):
        return None

    value = (
        data.get("nextCursor")
        or data.get("next_cursor")
        or data.get("cursor")
    )

    return value if isinstance(value, str) and value.strip() else None


def get_all_karakeep_bookmarks() -> list[dict[str, Any]]:
    bookmarks: list[dict[str, Any]] = []
    cursor = None

    while True:
        params: dict[str, Any] = {"limit": 100}
        if cursor:
            params["cursor"] = cursor

        data = karakeep_get("/bookmarks", params=params)
        bookmarks.extend(extract_items(data))

        cursor = get_next_cursor(data)
        if not cursor:
            break

    return bookmarks


def get_full_karakeep_bookmark(bookmark_id: str) -> dict[str, Any]:
    data = karakeep_get(f"/bookmarks/{bookmark_id}")
    if not isinstance(data, dict):
        raise RuntimeError(f"Unexpected bookmark response for {bookmark_id}")
    return data


def extract_url(bookmark: dict[str, Any]) -> str | None:
    candidates = [
        bookmark.get("url"),
        bookmark.get("sourceUrl"),
        bookmark.get("canonicalUrl"),
    ]

    content = bookmark.get("content")
    if isinstance(content, dict):
        candidates.extend(
            [
                content.get("url"),
                content.get("sourceUrl"),
                content.get("canonicalUrl"),
            ]
        )

    link = bookmark.get("link")
    if isinstance(link, dict):
        candidates.extend(
            [
                link.get("url"),
                link.get("sourceUrl"),
                link.get("canonicalUrl"),
            ]
        )

    for url in candidates:
        if isinstance(url, str) and url.startswith(("http://", "https://")):
            return url.strip()

    return None


def extract_title(bookmark: dict[str, Any]) -> str | None:
    candidates = [
        bookmark.get("title"),
        bookmark.get("name"),
    ]

    content = bookmark.get("content")
    if isinstance(content, dict):
        candidates.extend(
            [
                content.get("title"),
                content.get("name"),
            ]
        )

    link = bookmark.get("link")
    if isinstance(link, dict):
        candidates.extend(
            [
                link.get("title"),
                link.get("name"),
            ]
        )

    for value in candidates:
        if isinstance(value, str) and value.strip():
            return repair_mojibake(value.strip())

    return None


def crawl_status(bookmark: dict[str, Any]) -> str | None:
    content = bookmark.get("content")
    if not isinstance(content, dict):
        return None

    value = content.get("crawlStatus")
    return value if isinstance(value, str) else None


def find_archive_asset_id(bookmark: dict[str, Any]) -> tuple[str, str] | tuple[None, None]:
    """
    Prefer Karakeep's extracted readable HTML asset first.

    Priority:
      1. linkHtmlContent
      2. fullPageArchive
    """

    assets = bookmark.get("assets")

    if isinstance(assets, list):
        for asset in assets:
            if not isinstance(asset, dict):
                continue

            if asset.get("assetType") == "linkHtmlContent":
                asset_id = asset.get("id")
                if isinstance(asset_id, str) and asset_id.strip():
                    return asset_id.strip(), "linkHtmlContent"

    content = bookmark.get("content")
    if isinstance(content, dict):
        asset_id = content.get("fullPageArchiveAssetId")
        if isinstance(asset_id, str) and asset_id.strip():
            return asset_id.strip(), "fullPageArchive"

    if isinstance(assets, list):
        for asset in assets:
            if not isinstance(asset, dict):
                continue

            if asset.get("assetType") == "fullPageArchive":
                asset_id = asset.get("id")
                if isinstance(asset_id, str) and asset_id.strip():
                    return asset_id.strip(), "fullPageArchive"

    return None, None


def download_karakeep_archive_content(asset_id: str, asset_type: str) -> str | None:
    response = karakeep_get_raw(f"/assets/{asset_id}")

    content_type = response.headers.get("Content-Type", "unknown")
    text = decode_response_text(response)

    if len(text) < MIN_ARCHIVE_CHARS:
        print(
            f"  Archive asset too small; skipping for now: "
            f"type={asset_type}, id={asset_id}, chars={len(text)}, minimum={MIN_ARCHIVE_CHARS}"
        )
        return None

    if "Just a moment" in text[:3000]:
        print(
            f"  WARNING: archive asset appears to contain a bot-check page: "
            f"type={asset_type}, id={asset_id}"
        )

    print(f"  Using Karakeep asset type={asset_type}, id={asset_id}")
    print(f"  Downloaded archive asset Content-Type={content_type}, chars={len(text)}")
    print(f"  Archive preview: {text[:250].replace(chr(10), ' ').replace(chr(13), ' ')}")

    return text


def extract_inline_archived_content(bookmark: dict[str, Any]) -> str | None:
    """
    Only use true inline HTML/text fields.
    Do not use content.description here; that is metadata/snippet, not archive content.
    """

    candidates = [
        bookmark.get("html"),
        bookmark.get("contentHtml"),
        bookmark.get("text"),
        bookmark.get("contentText"),
        bookmark.get("fullText"),
    ]

    content = bookmark.get("content")
    if isinstance(content, dict):
        candidates.extend(
            [
                content.get("html"),
                content.get("htmlContent"),
                content.get("contentHtml"),
                content.get("text"),
                content.get("contentText"),
                content.get("fullText"),
            ]
        )

    for value in candidates:
        if isinstance(value, str):
            cleaned = repair_mojibake(value.strip())
            if len(cleaned) >= MIN_ARCHIVE_CHARS:
                return cleaned

    return None


def extract_archived_content(bookmark: dict[str, Any]) -> str | None:
    inline = extract_inline_archived_content(bookmark)
    if inline:
        print(f"  Using inline Karakeep archived content, chars={len(inline)}")
        print(f"  Inline preview: {inline[:250].replace(chr(10), ' ').replace(chr(13), ' ')}")
        return inline

    asset_id, asset_type = find_archive_asset_id(bookmark)
    if asset_id and asset_type:
        return download_karakeep_archive_content(asset_id, asset_type)

    return None


def should_wait_for_karakeep(bookmark: dict[str, Any]) -> tuple[bool, str]:
    age = bookmark_age_seconds(bookmark)
    if age is not None and age < MIN_BOOKMARK_AGE_SECONDS:
        return True, f"bookmark is too new: age={age:.1f}s, minimum={MIN_BOOKMARK_AGE_SECONDS}s"

    status = crawl_status(bookmark)
    if status and status.lower() not in {"success", "failed"}:
        return True, f"crawlStatus is not finished: {status}"

    return False, ""


# ----------------------------
# Wallabag API
# ----------------------------

def get_wallabag_token() -> str:
    response = requests.post(
        f"{WALLABAG_BASE_URL}/oauth/v2/token",
        data={
            "grant_type": "password",
            "client_id": WALLABAG_CLIENT_ID,
            "client_secret": WALLABAG_CLIENT_SECRET,
            "username": WALLABAG_USERNAME,
            "password": WALLABAG_PASSWORD,
        },
        timeout=60,
    )
    response.raise_for_status()

    data = response.json()
    token = data.get("access_token")
    if not isinstance(token, str) or not token:
        raise RuntimeError(f"Wallabag token response did not include access_token: {data}")

    return token


def wallabag_add_entry(
    token: str,
    url: str,
    title: str | None,
    content: str,
) -> None:
    content = repair_mojibake(content)

    data = {
        "url": url,
        "content": content,
        "tags": "karakeep,auto-import,karakeep-archive",
    }

    if title:
        data["title"] = repair_mojibake(title)

    print(f"  Sending to Wallabag: url={url}")
    print(f"  Sending title: {title or '(none)'}")
    print(f"  Sending content chars: {len(content)}")

    response = requests.post(
        f"{WALLABAG_BASE_URL}/api/entries.json",
        headers={
            "Authorization": f"Bearer {token}",
            "Accept": "application/json",
        },
        data=data,
        timeout=120,
    )

    if response.status_code not in (200, 201):
        raise RuntimeError(
            f"Wallabag failed for {url}: {response.status_code} {response.text}"
        )

    try:
        result = response.json()
        wallabag_id = result.get("id")
        stored_title = result.get("title")
        stored_content = result.get("content")

        print(f"  Wallabag response status: {response.status_code}")
        print(f"  Wallabag entry id: {wallabag_id}")
        print(f"  Wallabag stored title: {stored_title}")
        if isinstance(stored_content, str):
            stored_content = repair_mojibake(stored_content)
            print(f"  Wallabag response content chars: {len(stored_content)}")
            print(f"  Wallabag response preview: {stored_content[:250].replace(chr(10), ' ').replace(chr(13), ' ')}")
        else:
            print("  Wallabag response content chars: unavailable")
    except Exception:
        print(f"  Wallabag response status: {response.status_code}")
        print("  Wallabag response was not parsed as JSON.")


# ----------------------------
# Main
# ----------------------------

def main() -> int:
    synced = load_state()

    print("Loading Karakeep bookmarks...")
    bookmarks = get_all_karakeep_bookmarks()
    print(f"Found {len(bookmarks)} Karakeep bookmarks.")

    print("Getting Wallabag token...")
    token = get_wallabag_token()

    added = 0
    skipped_synced = 0
    skipped_too_new = 0
    skipped_not_ready = 0
    skipped_no_url = 0
    skipped_no_archive = 0
    failed = 0

    for bookmark in bookmarks:
        bookmark_id = bookmark.get("id")
        url = extract_url(bookmark)

        if not isinstance(bookmark_id, str) or not bookmark_id.strip():
            continue

        if not url:
            skipped_no_url += 1
            continue

        if url in synced:
            skipped_synced += 1
            continue

        print()
        print(f"Processing bookmark id={bookmark_id}")
        print(f"URL: {url}")

        try:
            full_bookmark = get_full_karakeep_bookmark(bookmark_id)

            should_wait, reason = should_wait_for_karakeep(full_bookmark)
            if should_wait:
                if "too new" in reason:
                    skipped_too_new += 1
                else:
                    skipped_not_ready += 1
                print(f"  Result: skipping for now; {reason}")
                continue

            title = extract_title(full_bookmark) or extract_title(bookmark)
            content = extract_archived_content(full_bookmark)

            if not content:
                skipped_no_archive += 1
                print("  Result: skipping for now; no usable Karakeep archive content found")
                continue

            print("  Result: adding with Karakeep archive content")

            wallabag_add_entry(
                token=token,
                url=url,
                title=title,
                content=content,
            )

            synced.add(url)
            added += 1

        except Exception as exc:
            failed += 1
            print(f"FAILED: {url}: {exc}", file=sys.stderr)

    save_state(synced)

    print()
    print(
        "Done. "
        f"bookmarks={len(bookmarks)}, "
        f"added={added}, "
        f"skipped_synced={skipped_synced}, "
        f"skipped_too_new={skipped_too_new}, "
        f"skipped_not_ready={skipped_not_ready}, "
        f"skipped_no_url={skipped_no_url}, "
        f"skipped_no_archive={skipped_no_archive}, "
        f"failed={failed}"
    )

    return 1 if failed else 0


if __name__ == "__main__":
    raise SystemExit(main())

Make it executable:

chmod +x /opt/karakeep-wallabag-sync/sync.py

5. Test manually

Load the environment file and run the script:

cd /opt/karakeep-wallabag-sync

set -a
. /etc/karakeep-wallabag-sync.env
set +a

. .venv/bin/activate

python -m py_compile sync.py
./sync.py

Expected successful output includes lines like:

Using Karakeep asset type=linkHtmlContent
Downloaded archive asset Content-Type=text/html, chars=...
Sending content chars: ...
Wallabag response status: 200

If a bookmark was created too recently, the script should skip it:

Result: skipping for now; bookmark is too new

If Karakeep has not created an archive asset yet, the script should skip it:

Result: skipping for now; no usable Karakeep archive content found

The script should not send URL-only entries to Wallabag.

6. Automate with systemd

Create the service:

cat >/etc/systemd/system/karakeep-wallabag-sync.service <<'EOF'
[Unit]
Description=Sync Karakeep bookmarks to Wallabag using Karakeep archived content
Wants=network-online.target
After=network-online.target

[Service]
Type=oneshot
EnvironmentFile=/etc/karakeep-wallabag-sync.env
WorkingDirectory=/opt/karakeep-wallabag-sync
ExecStart=/opt/karakeep-wallabag-sync/.venv/bin/python /opt/karakeep-wallabag-sync/sync.py
StandardOutput=journal
StandardError=journal
EOF

Create the timer:

cat >/etc/systemd/system/karakeep-wallabag-sync.timer <<'EOF'
[Unit]
Description=Run Karakeep to Wallabag sync every 10 minutes

[Timer]
OnBootSec=2min
OnUnitActiveSec=10min
Unit=karakeep-wallabag-sync.service
Persistent=true

[Install]
WantedBy=timers.target
EOF

Enable it:

systemctl daemon-reload
systemctl enable --now karakeep-wallabag-sync.timer

Check the timer:

systemctl status karakeep-wallabag-sync.timer
systemctl list-timers | grep karakeep

Force a run:

systemctl start karakeep-wallabag-sync.service

View logs:

journalctl -u karakeep-wallabag-sync.service -n 150 --no-pager

Follow logs live:

journalctl -u karakeep-wallabag-sync.service -f

7. How duplicate prevention works

The script keeps a local JSON file of URLs already synced to Wallabag:

/var/lib/karakeep-wallabag-sync/synced.json

A URL is added to that file only after Wallabag accepts the entry.

This matters because skipped bookmarks should be retried later. For example:

Bookmark too new        - skipped, not marked synced
No archive yet          - skipped, not marked synced
Wallabag API failure    - failed, not marked synced
Wallabag success        - marked synced

To force a re-test of a URL:

Delete the entry from Wallabag.
Remove the URL from synced.json.
Run the script again.

8. Race-condition protection

A bookmark can appear in Karakeep before its archive assets are ready.

This script avoids that by using two guards.

First, it waits for a minimum bookmark age:

MIN_BOOKMARK_AGE_SECONDS=120

Second, it checks Karakeep’s crawl status and skips unfinished bookmarks.

The goal is to avoid this bad sequence:

1. Bookmark created
2. Sync runs immediately
3. No Karakeep archive exists yet
4. Script sends URL-only to Wallabag
5. Wallabag fetches blocked page
6. Bad article is stored

Instead, the script does this:

1. Bookmark created
2. Sync runs too early
3. Script skips it
4. Next timer run retries
5. Karakeep archive exists
6. Script sends archived content to Wallabag

9. Encoding repair

Some pages may contain mojibake such as:

letâ??s

instead of:

let’s

The script attempts to repair common UTF-8 decoded as Latin-1 or Windows-1252 before sending content to Wallabag.

This is handled by:

repair_mojibake(...)

It is not a universal character-encoding solution, but it handles the common apostrophe/quote corruption case.

10. Summary

The finished setup uses:

Karakeep as the capture/archive source
Wallabag as the offline reader
Python as the bridge
systemd timer as the scheduler

The important behavior is that the script sends Karakeep’s archived HTML content to Wallabag, not just the URL.

That lets Wallabag store readable articles even when Wallabag’s own fetcher would be blocked by the site.