Self-hosting Immich for photos
A journey of missing parts
By bob, 2463 words, 12 minutes to read
NOTE This post contains several product links. These are not sponsored/affiliate links, and I am not associated with the companies mentioned in any way.
Self-hosting everything because I can
I've been on a journey of moving more and more of my life from cloud-hosted services to self-hosted ones, starting with various "smart" devices in my house. While the cloud is convenient, there's always the risk of a third-party company suddenly changing its policies or locking you out of your account.
The first device I replaced was my myQ garage door opener (replaced with Konnected), since the myQ cannot be controlled with Home Assistant (and is actively hostile towards it). The next device to go was my Ring doorbell (replaced with Reolink), because I was tired of paying Ring/Amazon a monthly fee just to use a doorbell. Also, both the myQ and Ring are pretty major security and privacy risks if you don't trust a third-party company with access to your house.
But this post isn't about smart devices, it's about pictures.
Self-hosted memories
Of all the various forms of data we have to keep track of, my wife and I decided the top priority was all our old pictures. We have pictures backed up on Google Photos and Amazon Photos, on our phones, and in various old .zip archives, but none of it was really organized or fully under our control. My wife has also been getting warnings about her Google account storage space, the majority of which is taken up by pictures.
While browsing various self-hosting forums, one app kept appearing as a solution for hosting pictures: Immich.
Full featured, but not quite there yet
There is a banner at the top of the Immich website that currently states:
⚠️ The project is under very active development. Expect bugs and changes. Do not use it as the only way to store your photos and videos!
While so far my experience with Immich has been generally good, there's a reason for this warning. The project is not fully polished and I've encountered several (mostly minor) bugs so far. Nothing show-stopping, but don't expect a completely smooth experience out of the box.
Lots of install options, but no obvious easy path
At the time of this writing, the install guide has eight different options for installing Immich, though the Quick start guide does point you toward the Docker Compose option. There is an install script which is marked "Experimental", and seems to basically just do the steps in the Docker Compose document. There are no package manager (apt, brew, etc.) install options.
The install instructions are to "create a directory of your choice" to hold the relevant files, and just wget them from the GitHub release into that directory. I picked /opt/immich instead of the relative ./immich-app that the install guide (and script) suggests. I left the postgres folder as the relative ./postgres, but changed the UPLOAD_LOCATION to my external storage. This is the most important setting to change if you're running Immich on a host with multiple filesystems and you want to store all of your pictures on your RAID/NAS/USB/whatever.
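For reference, the relevant bits of my .env end up looking something like this (the variable names are from the current example .env, so they may differ slightly between releases; the external path is specific to my setup):
# Where Immich stores uploaded originals, thumbnails, etc.
UPLOAD_LOCATION=/media/external/immich
# Left as the default relative path next to docker-compose.yml
DB_DATA_LOCATION=./postgres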
A little help from my friends (scripts)
Ok, logging in and typing docker compose up -d every time isn't really a good option. At the very least, I want something to start this on boot.
[Unit]
Description=Immich
After=docker.service
Requires=docker.service
[Service]
Type=simple
WorkingDirectory=/opt/immich
ExecStart=/usr/bin/docker compose up
ExecStop=/usr/bin/docker compose stop
[Install]
WantedBy=multi-user.target
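That goes in /etc/systemd/system/immich.service (the name matters, since the upgrade script below runs systemctl stop immich). Note that ExecStart runs compose in the foreground, without -d, so systemd owns the process. Then it's the usual:
~$ sudo systemctl daemon-reload
~$ sudo systemctl enable --now immich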
The upgrade instructions just say to docker compose pull to get the latest images, but I also noticed that the image tag for the redis container in the docker-compose.yml file changed in the first release that I upgraded to. So apparently that needs to be updated sometimes, which the documentation does not mention at all. I've settled on this for an upgrade script for now:
#!/bin/bash
set -euo pipefail
# Run from wherever the script lives (/opt/immich in my case)
cd "$(dirname "$0")"
systemctl stop immich
# Grab the latest docker-compose.yml in case any image tags changed
wget -q --show-progress -O docker-compose.yml https://github.com/immich-app/immich/releases/latest/download/docker-compose.yml
docker compose pull
systemctl start immich
# Clean up the old images
docker image prune -f
This obviously won't work if you have done any customization of your docker-compose.yml file, like using one of the CUDA-accelerated ML variants, so YMMV.
Bulk importing 25 years of photos, and deduplication hell
Up to around 2015, I stored all of my photos on Flickr. For reasons that I can't remember anymore, I stopped using it and moved everything to Google Photos. Before I left, I made an export of all of my data. The export consists of the actual image files and a bunch of .json files containing all of the photo and album metadata. I had added descriptions/comments and sorted everything into albums, none of which ever made it over to Google Photos.
Meanwhile, almost every picture I've taken since then has been on an Android phone, and the data has been moved from phone to phone every upgrade without losing anything (luckily). I also have miscellaneous folders and archives of backed-up photos which I don't remember being managed by anything (possibly f-spot).
Immich has a Node-based CLI that can recursively import from a directory, and there is also an immich-go version that can import from directories, zipfiles, or specifically Google Photos takeout archives. The web interface lets you upload multiple files from a standard file-picker dialog, but doesn't handle uploading archives. Everything is also available through an API, and there is a simple Python-based example of uploading a file.
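That Python example boils down to a multipart POST with an API key header. From memory it looks roughly like the following; the asset upload endpoint has moved around between releases, so treat the path (and the placeholder host) as things to check against the docs for your version:
import os
import requests
from datetime import datetime, timezone

API_KEY = "REDACTED"
BASE_URL = "http://immich.local:2283/api"  # placeholder host

def upload_file(path):
    stats = os.stat(path)
    mtime = datetime.fromtimestamp(stats.st_mtime, tz=timezone.utc).isoformat()
    headers = {
        "Accept": "application/json",
        "x-api-key": API_KEY,
    }
    # deviceAssetId/deviceId just need to be stable, unique-ish identifiers
    data = {
        "deviceAssetId": f"{os.path.basename(path)}-{stats.st_mtime}",
        "deviceId": "python-import",
        "fileCreatedAt": mtime,
        "fileModifiedAt": mtime,
        "isFavorite": "false",
    }
    with open(path, "rb") as f:
        response = requests.post(f"{BASE_URL}/assets", headers=headers,
                                 data=data, files={"assetData": f})
    response.raise_for_status()
    return response.json()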
Don't do what I did.
Things would have been a lot simpler if I only had one unique copy of my photos, but I had at least 3, possibly 4, slightly different copies of important photos (like wedding pictures). I have the originals, the slightly reduced "storage saver" version from Google Photos, and the Flickr versions that also seem to be a slightly lower (or at least different) quality from the originals. Both the web interface and the immich-go tool are smart enough to skip files that are 100% identical, but will happily upload a re-encoded version of the same picture. That sounds like it would be a complete disaster, but Immich runs a background de-duplication job that uses ML to match nearly identical pictures and suggest them for de-duplication. So there's a way to fix it, but it's still miserable if you have thousands of duplicates to work through.
I ended up importing my files in this order:
- originals from my phone camera, via the mobile app
- miscellaneous backup folders and archives
- Flickr backup archives
By the time I got to importing the Flickr archives, immich-go was flagging a good number of them as duplicates and not uploading them. Which would be fine, except that all of the album information was associated with the Flickr file names. I also wanted to copy over names and descriptions of photos from Flickr, which would be helpful when trying to figure out what a blurry photo from 2006 is supposed to be.
Since I hadn't specified the Flickr import to go into an album or anything, the only way I could figure out how to identify the imported photos was that the Flickr filenames all had a 10-digit ID in the names that matched the photo IDs in the .json files. That only matched about half of the pictures, though. The other half were flagged as duplicates by immich-go. At this point, I probably should have just wiped my install and started over, but I just kept stumbling on. I eventually found that I could match on timestamps and find almost every imported photo. I could then make API calls to create albums and set descriptions for everything I had Flickr data for.
(It actually involved a lot more stumbling around than that, as I had to do several rounds of fine-tuning the filename and timestamp matching, then updating the albums and photos I had already created. Again, it might have been easier to just wipe everything and re-do it. But this section is long enough already.)
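For anyone attempting the same thing, the core of the filename matching was nothing clever: pull the 10-digit ID out of each Flickr filename and look it up in the export's metadata. A stripped-down sketch, with the caveat that the photo_*.json naming and field layout here are from my export and may not match yours exactly:
import json
import re
from pathlib import Path

# Flickr export filenames look like "some-title_1234567890_o.jpg";
# the 10-digit number matches the "id" in the photo_<id>.json files.
FLICKR_ID = re.compile(r"(\d{10})")

def load_flickr_metadata(export_dir):
    metadata = {}
    for json_file in Path(export_dir).glob("photo_*.json"):
        data = json.loads(json_file.read_text())
        metadata[str(data["id"])] = data
    return metadata

def match_filename(filename, metadata):
    m = FLICKR_ID.search(filename)
    if m:
        return metadata.get(m.group(1))
    return None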
Random Immich API code
Immich has API documentation, but not a lot of guidance on how to actually accomplish anything. Since the web interface uses the API, I ended up figuring a lot out by just watching network traffic. My Flickr import script is a huge mess of commented-out single-use code, but there are a few useful snippets:
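All of the snippets below share the same boilerplate: the requests library, an API key generated in the web UI, and the base URL of the server's API (the host name and port here are placeholders for my install):
import requests

# Created under Account Settings -> API Keys in the web UI
API_KEY = "REDACTED"
# The REST endpoints live under /api on the web port
BASE_URL = "http://immich.local:2283/api"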
For starters, there is no "list all photos" API endpoint. Photos are "assets" and assets are grouped in buckets in the timeline. If you want to get all of them (like if you're trying to do bulk matching of names and timestamps) you have to get all assets for each bucket:
def get_all_assets():
    headers = {
        'Accept': 'application/json',
        'x-api-key': API_KEY
    }
    # Each bucket is one month of the timeline
    buckets = requests.get(f"{BASE_URL}/timeline/buckets?size=MONTH", headers=headers)
    buckets.raise_for_status()
    for bucket in buckets.json():
        # Then pull the full asset list for that month
        assets = requests.get(f"{BASE_URL}/timeline/bucket?size=MONTH&timeBucket={bucket['timeBucket']}", headers=headers)
        assets.raise_for_status()
        for asset in assets.json():
            yield asset
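For the bulk matching, this mostly just fed a couple of lookup dictionaries keyed on filename and timestamp; the asset field names below are what I recall from the responses, so verify them against your version:
by_name = {}
by_time = {}
for asset in get_all_assets():
    by_name[asset["originalFileName"]] = asset["id"]
    by_time[asset["fileCreatedAt"]] = asset["id"]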
Setting the description or changing the time for an asset in the web interface uses the same PUT /assets/:id endpoint:
def set_asset_data(id, description, date_taken):
    headers = {
        'Accept': 'application/json',
        'x-api-key': API_KEY
    }
    # Only the fields present in the body get updated
    data = {
        "description": description,
        "dateTimeOriginal": date_taken,
    }
    response = requests.put(f"{BASE_URL}/assets/{id}", headers=headers, json=data)
    response.raise_for_status()
Omit description or dateTimeOriginal to only set one or the other.
Albums from the list /albums endpoint do not actually contain the list of assets in the album; you have to fetch each album individually to get that.
def get_albums():
    headers = {
        'Accept': 'application/json',
        'x-api-key': API_KEY
    }
    albums = requests.get(f"{BASE_URL}/albums", headers=headers)
    albums.raise_for_status()
    for album in albums.json():
        # The list endpoint omits assets, so fetch each album individually
        response = requests.get(f"{BASE_URL}/albums/{album['id']}", headers=headers)
        response.raise_for_status()
        yield response.json()
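The one piece not shown above is creating the albums themselves. That's a plain POST to /albums; this sketch is from memory of the API docs, so double-check the field names (albumName, assetIds) against your version:
def create_album(name, asset_ids):
    headers = {
        'Accept': 'application/json',
        'x-api-key': API_KEY
    }
    data = {
        "albumName": name,
        "assetIds": asset_ids,
    }
    response = requests.post(f"{BASE_URL}/albums", headers=headers, json=data)
    response.raise_for_status()
    return response.json()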
Side quest: what happened to my storage space?
At some point, I ran df -h to get an idea of how much space I was using on my external storage after all of these imports. I got a different surprise, though:
~$ df -h
Filesystem Size Used Avail Use% Mounted on
tmpfs 392M 6.9M 385M 2% /run
/dev/mapper/ubuntu--vg-ubuntu--lv 15G 14G 1.0G 93% /
tmpfs 2.0G 0 2.0G 0% /dev/shm
tmpfs 5.0M 0 5.0M 0% /run/lock
/dev/sda2 2.0G 183M 1.7G 10% /boot
/dev/sdb1 5.5T 386G 4.8T 8% /media/external
tmpfs 392M 12K 392M 1% /run/user/1000
Wait, my root filesystem is almost full? Immich uses some decent-sized Docker images, plus it pulls some large ML models to do face detection, etc. Still, it shouldn't be using that much storage. This is running on a Proxmox VM with the default 32 GB thin-provisioned drive. Where did it all go?
Let's see what Docker is doing:
~$ docker system df
TYPE TOTAL ACTIVE SIZE RECLAIMABLE
Images 5 5 3.521GB 74.77MB (2%)
Containers 6 4 24.46kB 0B (0%)
Local Volumes 2 2 801.5MB 0B (0%)
Build Cache 0 0 0B 0B
Ok, that's not really a lot. Let's call it a round 5 GB. There's a few other things on this box, but not 32 GB worth. Where is the rest?
...
There's a hint in the previous df -h output. There is a 2 GB boot partition, and the root filesystem is 15 GB, not 30. The rest of the 32 GB drive is mysteriously missing.
~$ lsblk
NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINTS
sda 8:0 0 32G 0 disk
├─sda1 8:1 0 1M 0 part
├─sda2 8:2 0 2G 0 part /boot
└─sda3 8:3 0 30G 0 part
└─ubuntu--vg-ubuntu--lv 252:0 0 15G 0 lvm /
Ok, so the Ubuntu Server install defaults to using LVM and using the whole drive. Except that it doesn't use the whole drive. It makes a volume group for the whole drive, but only makes a logical volume that uses half of the space, leaving the other half unallocated. Apparently this is by design. Surprising and infuriating.
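If you want to see the unallocated space for yourself before touching anything, vgs lists it in the VFree column:
~$ sudo vgs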
Fortunately, LVM makes this easy to fix without needing to take the server offline (thank you, Stack Exchange):
~# lvextend -l +100%FREE /dev/mapper/ubuntu--vg-ubuntu--lv
[snip]
~# resize2fs /dev/mapper/ubuntu--vg-ubuntu--lv
[snip]
These commands complete almost instantly and resize the volume and filesystem to fill the rest of the drive.
~$ df -h
Filesystem Size Used Avail Use% Mounted on
tmpfs 392M 6.9M 385M 2% /run
/dev/mapper/ubuntu--vg-ubuntu--lv 30G 12G 17G 43% /
tmpfs 2.0G 0 2.0G 0% /dev/shm
tmpfs 5.0M 0 5.0M 0% /run/lock
/dev/sda2 2.0G 183M 1.7G 10% /boot
/dev/sdb1 5.5T 386G 4.8T 8% /media/external
tmpfs 392M 12K 392M 1% /run/user/1000
Much better.
Backing up 50 GB of photos
We do have cloud "backups" with Google and Amazon, but the goal here was to reduce our dependence on those services and keep everything under our control. So we still need some kind of remote backup, just something a little more private.
For now, I'm using rclone with Backblaze B2. The pricing page only gives you a price of $6/TB per month, but it's actually metered in GB-hour increments, and the first 10 GB are free. So storing 50 GB of photos works out to pocket change: (50 GB - 10 GB free) at $0.006 per GB-month is about $0.24 a month.
Just in case, I set caps on everything to $1/day. I didn't expect to hit anywhere near that, but wanted to make sure I didn't start incurring huge charges by accident.
The backup job is a simple cron task for every day at 3am (Immich runs a database backup at 2am by default).
0 3 * * * rclone --b2-hard-delete sync /media/external/immich/ b2:REDACTED
Yes, I'm deleting and not hiding files. Maybe not a good idea for backups, but the database backups would otherwise be adding a new 80 MB file every day forever.
The next day, I noticed I was being charged $0.09 for the day in "Class C Transaction costs". Apparently this is related to listing files, and while you get 2,500 for free every day, I was running nearly 25,000!
In the rclone B2 docs, there is this very brief note:
--fast-list
This remote supports --fast-list which allows you to use fewer transactions in exchange for more memory. See the rclone docs for more details.
Yes, I would like to use fewer transactions, thank you.
0 3 * * * rclone --b2-hard-delete --fast-list sync /media/external/immich/ b2:REDACTED
I only noticed the transaction costs as I was writing this blog post, so I'll have to check back tomorrow and see if that's resolved. I did run that command as a one-off and it completed without running out of memory and only incremented the "class C transaction" count by a few. So here's hoping.
UPDATE: That worked, it's only showing 40 class C transactions for today.
Bugs, updates, other thoughts
Remember all of those duplicates that I had to clear out? Well, after deleting them, they stayed stuck in a broken state in the mobile app. I ended up clearing the app data to get rid of them. This is a known issue. The mobile app is also missing a lot of features that are only available through the web interface.
In the week (?) since I set up Immich, they've had two more releases. While it's nice to see that the project is being actively developed, it's a bit annoying to get an update notification every other day. Neither upgrade included breaking changes, though I did notice the redis tag update for Docker Compose, which prompted me to create my upgrade script. The release history shows quite a few breaking changes, so reading the release notes for each update will be necessary to avoid any surprises.
Neither of these should be surprising, as the project page very clearly warns you of these exact things. I'm still overall very impressed with the project, and might start helping out on issues where I can.