# intake

Intake is an arbitrary feed aggregator that generalizes the concept of a feed.
Rather than being restricted to parsing items out of an RSS feed, Intake adds a middle layer that executes arbitrary commands conforming to a JSON-based specification.
An Intake source can parse an RSS feed, but it can also scrape a website without a feed, provide additional logic to filter or annotate feed items, or integrate with an API.

## Development

Parity with existing Python version

* [x] create sources
* [ ] rename sources
* fetch sources
* [x] create and delete items
* [x] update existing items
* [ ] support item TTL and TTD
* [x] on_create triggers
* [ ] on_delete triggers
* [x] dry-run
* item actions
* [x] create
* [x] edit
* [ ] rename
* [x] delete
* [x] execute
* [ ] require items to declare action support
* [ ] state files
* [ ] source environment
* [ ] working directory set
* [ ] update web UI credentials
* [ ] automatic crontab integration
* [ ] feed supports item TTS
* [ ] data directory from envvars
* [ ] source-level tt{s,d,l}
* [ ] source batching
* channels
* [ ] create
* [ ] edit
* [ ] rename
* [ ] delete
* feeds
* [x] show items
* [x] deactivate items
* [ ] mass deactivate
* [ ] punt
* [ ] trigger actions
* [x] add ad-hoc items
* [ ] show/hide deactivated items
* [ ] show/hide tts items
* [ ] NixOS module
* [ ] NixOS module demo

Additional features

* [ ] metric reporting
* [ ] on action failure, create an error item with logs
* [ ] first-party password handling instead of basic auth and htpasswd
* [ ] items gracefully add new fields and `action` keys
* [ ] arbitrary date punt
* [ ] HTTP edit item
* [ ] sort crontab entries
* [ ] TUI feed view

## Overview

In Intake, a _source_ represents a single content feed of discrete _items_, such as a blog and its posts or a website and its pages.
Each source has associated _actions_, which are executable commands.
The `fetch` action checks the feed and returns the items in a JSON format.
Each item returned by a fetch is stored by Intake and appears in that source's feed.
When you have read an item, you can deactivate it, which hides it from your feed.
When a deactivated item is no longer returned by `fetch`, it is deleted.
This allows you to consume feed content at your own pace without missing anything.

Intake stores all its data in a SQLite database.
This database is stored in `$INTAKE_DATA_DIR`, `$XDG_DATA_HOME/intake`, or `$HOME/.local/share/intake`, whichever is resolved first.
The data directory can also be specified on the command line with `--data-dir`/`-d` instead of through the environment.

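The lookup order described above could be sketched in Python as follows. This is only an illustration, not Intake's own code: the function name `resolve_data_dir` is hypothetical, and the sketch assumes that a `--data-dir` value takes precedence over the environment and that empty variables count as unset.

```python
from pathlib import Path


def resolve_data_dir(env, cli_data_dir=None):
    """Return the directory that would hold the database.

    `env` is a mapping such as `os.environ`.
    Assumption: a command-line value, when given, wins over the environment.
    """
    if cli_data_dir:
        return Path(cli_data_dir)
    # Assumption: an empty variable is treated the same as an unset one.
    if env.get("INTAKE_DATA_DIR"):
        return Path(env["INTAKE_DATA_DIR"])
    if env.get("XDG_DATA_HOME"):
        return Path(env["XDG_DATA_HOME"]) / "intake"
    if env.get("HOME"):
        return Path(env["HOME"]) / ".local" / "share" / "intake"
    raise RuntimeError("no data directory could be resolved")
```

Calling `resolve_data_dir(os.environ)` would mirror a run with no `--data-dir` flag.
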
### Items

Items are passed between Intake and sources as JSON objects.
Only the `id` field is required.
Any unspecified field is equivalent to the empty string, object, or 0, depending on the field's type.

| Field name | Specification | Description |
| ---------- | ------------- | ----------- |
| `id` | **Required** | A unique identifier within the source. |
| `source` | **Automatic** | The source that produced the item. |
| `created` | **Automatic** | The Unix timestamp at which Intake first processed the item. |
| `active` | **Automatic** | Whether the item is active and displayed in feeds. |
| `title` | Optional | The title of the item. If an item has no title, `id` is used as a fallback title. |
| `author` | Optional | An author name associated with the item. Displayed in the item footer. |
| `body` | Optional | Body text of the item as raw HTML. This will be displayed in the item without further processing! Consider your sources' threat models against injection attacks. |
| `link` | Optional | A hyperlink associated with the item. |
| `time` | Optional | A Unix timestamp associated with the item, not necessarily when the item was created. Items sort by `time` when it is defined and fall back to `created`. Displayed in the item footer. |

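To make the defaulting rules concrete, here is a small Python sketch of how a consumer of items might fill in unspecified fields. The names `DEFAULTS`, `normalize_item`, and `display_title` are hypothetical, and the field types (strings, an integer `time`, an object-valued `action`) are assumptions inferred from the descriptions in this document rather than taken from Intake's code.

```python
import copy

# Assumed defaults per field type: empty string, 0, or empty object.
DEFAULTS = {
    "title": "",
    "author": "",
    "body": "",
    "link": "",
    "time": 0,
    "action": {},
}


def normalize_item(raw):
    """Fill unspecified optional fields with their empty values."""
    if not raw.get("id"):
        raise ValueError("item is missing the required `id` field")
    item = {"id": raw["id"]}
    for field, default in DEFAULTS.items():
        # Copy the default so items never share one mutable object.
        item[field] = raw.get(field, copy.deepcopy(default))
    return item


def display_title(item):
    """Use `id` as the fallback title when `title` is empty."""
    return item["title"] or item["id"]
```

With this sketch, `normalize_item({"id": "post-1"})` yields an item whose `display_title` is `post-1`.
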
Existing items are updated with new values when a fetch or action produces them, with some exceptions:

* Automatic fields cannot be changed.
* If a field's previous value is non-empty and the new value is empty, the old value is kept.

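The two update exceptions above can be illustrated with a short Python sketch. The function name `merge_item` is hypothetical, and treating "empty" as any falsy value (empty string, empty object, or 0) is an assumption based on the defaults described for items.

```python
# Fields marked **Automatic** in the item table.
AUTOMATIC_FIELDS = {"source", "created", "active"}


def merge_item(old, new):
    """Return `old` updated with values from `new`, honoring the exceptions."""
    merged = dict(old)
    for field, value in new.items():
        if field in AUTOMATIC_FIELDS:
            continue  # automatic fields cannot be changed
        if old.get(field) and not value:
            continue  # keep a non-empty old value over an empty new one
        merged[field] = value
    return merged
```

For example, merging a new item with an empty `title` into an existing item titled `Old` leaves the title as `Old`, while a newly supplied `link` is stored.
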
### Sources

A source is identified by its name. A minimally functional source requires a `fetch` action that returns items.

### Action API

The Intake action API defines how programs should behave to be used with Intake sources.

To execute an action, Intake executes the command specified by that action's `argv`.
The process's environment is as follows:

* `intake`'s environment is inherited.
* `STATE_PATH` is set to the absolute path of a file containing the source's persistent state.

When an action receives an item as input, that item's JSON representation is written to that action's `stdin`.
When an action outputs an item, it should write the item's JSON representation to `stdout` on one line.
All input and output is assumed to be UTF-8.
If an item cannot be parsed or the exit code of the process is nonzero, Intake will consider the action to be a failure.
No items will be created or updated as a result of the failed action.
Anything written to `stderr` by the action will be captured and logged by Intake.

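As a rough illustration of this contract from the caller's side, the following Python sketch runs an action and collects its output. It is not Intake's implementation: `run_action` and `ActionFailure` are hypothetical names, and rejecting output lines that lack an `id` is an assumption about what counts as an item that "cannot be parsed".

```python
import json
import os
import subprocess
import sys


class ActionFailure(Exception):
    """Raised when an action exits nonzero or prints an unparseable item."""


def run_action(argv, state_path, item=None):
    """Run `argv` as an action and return the items it wrote to stdout."""
    env = dict(os.environ)  # the caller's environment is inherited
    env["STATE_PATH"] = os.path.abspath(state_path)
    stdin_text = json.dumps(item) if item is not None else ""
    proc = subprocess.run(
        argv, input=stdin_text, capture_output=True, encoding="utf-8", env=env
    )
    if proc.stderr:
        # Stand-in for logging: forward the action's stderr.
        print(proc.stderr, file=sys.stderr, end="")
    if proc.returncode != 0:
        raise ActionFailure(f"exit code {proc.returncode}")
    items = []
    for line in proc.stdout.splitlines():
        if not line.strip():
            continue
        try:
            parsed = json.loads(line)
        except json.JSONDecodeError as exc:
            raise ActionFailure(f"unparseable item: {line!r}") from exc
        # Assumption: an item must be a JSON object with an `id`.
        if not isinstance(parsed, dict) or not parsed.get("id"):
            raise ActionFailure(f"not an item: {line!r}")
        items.append(parsed)
    return items
```

Because the function raises before returning anything, a failed action contributes no items, which matches the rule that nothing is created or updated on failure.
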
The `fetch` action receives no input and outputs multiple items.
This action is executed when a source is updated.
The `fetch` action is the core of an Intake source.

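A `fetch` action can be very small. The Python sketch below shows the output shape only: `fetch_items` is a hypothetical stand-in for real work such as parsing a feed or scraping a page, and the item values are made up for illustration.

```python
import json
import sys


def fetch_items():
    """Hypothetical stand-in for parsing a feed, scraping a page, etc."""
    return [
        {"id": "post-1", "title": "First post", "link": "https://example.com/1"},
        {"id": "post-2", "title": "Second post", "time": 1700000000},
    ]


def render_lines(items):
    """Serialize each item as one JSON object per line."""
    return "".join(json.dumps(item) + "\n" for item in items)


def main():
    # fetch receives no input; it only writes items to stdout.
    sys.stdout.write(render_lines(fetch_items()))
```

Saved as a script that ends with a call to `main()`, this could serve as the command in a source's `fetch` action.
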
All other actions take an item as input and should output the same item with any modifications made by the action.
Actions can only be executed for an item if that item has a key with the same name in its `action` field.
The value of that key may be any non-null JSON value used to pass state to the action.

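The sketch below shows one way a non-`fetch` action might use the value stored under its own key in `action` as state. Everything named here is hypothetical: the action name `count`, the helper functions, and the choice to store a run counter as the key's value.

```python
import json
import sys

ACTION_NAME = "count"  # hypothetical action name


def supports(item, action_name):
    """Mirror the rule above: the key must exist with a non-null value."""
    actions = item.get("action") or {}
    return actions.get(action_name) is not None


def apply_action(item):
    """Return the same item with the counter under `action` incremented."""
    updated = dict(item)
    actions = dict(updated.get("action") or {})
    previous = actions.get(ACTION_NAME)
    # Treat any non-integer value (such as `true`) as a count of zero.
    is_count = isinstance(previous, int) and not isinstance(previous, bool)
    actions[ACTION_NAME] = (previous if is_count else 0) + 1
    updated["action"] = actions
    return updated


def run(stdin_text):
    """Read one item as JSON text and return the modified item on one line."""
    return json.dumps(apply_action(json.loads(stdin_text))) + "\n"


def main():
    sys.stdout.write(run(sys.stdin.read()))
```

An item fetched with `"action": {"count": 0}` would declare support for this action, and each run would write the item back with the counter increased by one.
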
The special action `on_create` is always run when an item is first returned by a fetch.
The item does not need to declare support for `on_create`.
This action is not accessible through the web interface, so if you need to retry the action, you should create another action with the same command as `on_create`.
If an item's `on_create` fails, the item is still created, but without any changes made by the action.

The special action `on_delete` is like `on_create`, except it runs right before an item is deleted.
It does not require explicit support and is not accessible in the web interface.
The output of `on_delete` is ignored; it is primarily for causing side effects like managing state.
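
As an example of such a side effect, an `on_delete` action might record deleted item ids in the file named by `STATE_PATH`. The Python sketch below is illustrative only: this document does not specify a state file format, so storing JSON with a `deleted_ids` list is an assumption, as are the function names.

```python
import json
import os
import sys


def load_state(path):
    """Load the state file, treating a missing or empty file as empty state."""
    try:
        with open(path, encoding="utf-8") as f:
            return json.load(f)
    except (FileNotFoundError, json.JSONDecodeError):
        return {}


def record_deletion(item, state_path):
    """Append the item's id to the assumed `deleted_ids` list and save it."""
    state = load_state(state_path)
    deleted = state.setdefault("deleted_ids", [])
    if item["id"] not in deleted:
        deleted.append(item["id"])
    with open(state_path, "w", encoding="utf-8") as f:
        json.dump(state, f)
    return state


def main():
    # The item being deleted arrives on stdin; any output would be ignored.
    item = json.load(sys.stdin)
    record_deletion(item, os.environ["STATE_PATH"])
```

A later `fetch` could read the same file to avoid re-emitting items that were already deleted.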