Storage model

Documentation of the PersonalMediaVault storage model.

This is the documentation for the vault storage model, including the types of files it uses, their internal structure and the encryption algorithms used.

Use this document as reference for any software development that requires interaction with the vault files.

File types

The vault storage model uses different types of files:

Lock file: File used to prevent multiple instances of PersonalMediaVault from accessing the same vault.
Unencrypted JSON files: Configuration files that do not contain any protected vault data.
Encrypted JSON files: Used to store metadata.
Index files: Used to store lists of media asset IDs, in order to make searching faster.
Encrypted assets: Encrypted files containing the media assets. They can be single-file or multi-file.

Lock file

The lock file has the .lock extension.

It stores, in plain text, a decimal number representing the PID of the process currently accessing the vault.

The PersonalMediaVault backend should check whether this file exists, and whether the process it names is still running, before accessing the vault.
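The check described above could be sketched as follows. This is a minimal illustration, not the backend's actual code: the function names are hypothetical, trimming whitespace around the PID is an assumption, and probing a process with signal 0 is a Unix idiom that does not work on Windows.

```go
package main

import (
	"fmt"
	"os"
	"strconv"
	"strings"
	"syscall"
)

// ParseLockPID extracts the PID from the contents of a lock file.
// Trimming surrounding whitespace is an assumption of this sketch.
func ParseLockPID(content []byte) (int, error) {
	pid, err := strconv.Atoi(strings.TrimSpace(string(content)))
	if err != nil {
		return 0, fmt.Errorf("invalid lock file content: %w", err)
	}
	if pid <= 0 {
		return 0, fmt.Errorf("invalid PID %d", pid)
	}
	return pid, nil
}

// ProcessAlive reports whether a process with the given PID appears to be running.
// Signal 0 performs the existence check without delivering a signal (Unix only).
func ProcessAlive(pid int) bool {
	proc, err := os.FindProcess(pid)
	if err != nil {
		return false
	}
	return proc.Signal(syscall.Signal(0)) == nil
}

// VaultIsLocked reports whether the lock file exists and names a live process.
func VaultIsLocked(lockPath string) (bool, error) {
	content, err := os.ReadFile(lockPath)
	if os.IsNotExist(err) {
		return false, nil // no lock file: nobody is using the vault
	}
	if err != nil {
		return false, err
	}
	pid, err := ParseLockPID(content)
	if err != nil {
		return false, err
	}
	return ProcessAlive(pid), nil
}

func main() {
	locked, err := VaultIsLocked("vault.lock")
	fmt.Println(locked, err)
}
```

A stale lock file (one whose PID no longer maps to a running process) is reported as unlocked, so a new instance can safely take over the vault.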

Unencrypted JSON files

Unencrypted JSON files have the .json extension.

They follow the JSON format. The schema varies depending on the specific file.

Since they are not encrypted, they only store configuration, such as the port the backend should listen on, or the encryption parameters.

Encrypted JSON files

Encrypted JSON files have the .pmv extension.

They start from a JSON plaintext, which is encrypted using an algorithm like AES.

They are binary files, with the following structure:

Starting byte	Size (bytes)	Value name	Description
`0`	`2`	Algorithm ID	Identifier of the algorithm, stored as a Big Endian unsigned integer
`2`	`H`	Header	Header containing any parameters required by the encryption algorithm. The size depends on the algorithm used.
`2 + H`	`N`	Body	Body containing the raw encrypted data. The size depends on the initial unencrypted data and algorithm used.

The system is flexible enough to allow multiple encryption algorithms. Currently, there are 2 supported ones:

AES256_ZIP: ID = 1, Uses ZLIB (RFC 1950) to compress the data, and then uses AES with a key of 256 bits to encrypt the data, CBC as the mode of operation and an IV of 128 bits. This algorithm uses a header of 20 bytes, containing the following fields:

Starting byte	Size (bytes)	Value name	Description
`0`	`4`	Compressed plaintext size	Size of the compressed plaintext, in bytes, used to remove padding
`4`	`16`	IV	Initialization vector for AES_256_CBC algorithm

AES256_FLAT: ID = 2, Uses AES with a key of 256 bits to encrypt the data, CBC as the mode of operation and an IV of 128 bits. This algorithm uses a header of 20 bytes, containing the following fields:

Starting byte	Size (bytes)	Value name	Description
`0`	`4`	Plaintext size	Size of the plaintext, in bytes, used to remove padding
`4`	`16`	IV	Initialization vector for AES_256_CBC algorithm
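Putting the layout and the two algorithms together, a reader and writer for one encrypted block might look like the sketch below. Several details are assumptions not stated in this document: the 4-byte size field is treated as Big Endian, the padding is zero bytes up to the next AES block boundary, and the names `EncryptBlock`, `DecryptBlock` and `checkParams` are hypothetical.

```go
package main

import (
	"bytes"
	"compress/zlib"
	"crypto/aes"
	"crypto/cipher"
	"encoding/binary"
	"errors"
	"fmt"
	"io"
)

const (
	AlgoAES256Zip  uint16 = 1 // AES256_ZIP
	AlgoAES256Flat uint16 = 2 // AES256_FLAT
	prefixSize            = 2 + 4 + 16 // algorithm ID + size field + IV
)

func checkParams(algo uint16, key, iv []byte) error {
	if algo != AlgoAES256Zip && algo != AlgoAES256Flat {
		return fmt.Errorf("unsupported algorithm ID %d", algo)
	}
	if len(key) != 32 || len(iv) != aes.BlockSize {
		return errors.New("need a 32-byte key and a 16-byte IV")
	}
	return nil
}

// EncryptBlock produces: algorithm ID | size | IV | AES-256-CBC body.
// ASSUMPTIONS: zero padding and a Big Endian size field.
func EncryptBlock(plain, key, iv []byte, algo uint16) ([]byte, error) {
	if err := checkParams(algo, key, iv); err != nil {
		return nil, err
	}
	payload := plain
	if algo == AlgoAES256Zip {
		var buf bytes.Buffer
		w := zlib.NewWriter(&buf)
		w.Write(plain) // writes to a bytes.Buffer do not fail
		w.Close()
		payload = buf.Bytes()
	}
	block, err := aes.NewCipher(key)
	if err != nil {
		return nil, err
	}
	// Round the payload up to a whole number of AES blocks.
	padded := make([]byte, (len(payload)+aes.BlockSize-1)/aes.BlockSize*aes.BlockSize)
	copy(padded, payload)
	out := make([]byte, prefixSize+len(padded))
	binary.BigEndian.PutUint16(out[0:2], algo)
	binary.BigEndian.PutUint32(out[2:6], uint32(len(payload)))
	copy(out[6:22], iv)
	cipher.NewCBCEncrypter(block, iv).CryptBlocks(out[prefixSize:], padded)
	return out, nil
}

// DecryptBlock reverses EncryptBlock and returns the original plaintext.
func DecryptBlock(data, key []byte) ([]byte, error) {
	if len(data) < prefixSize {
		return nil, errors.New("block too short")
	}
	algo := binary.BigEndian.Uint16(data[0:2])
	size := uint64(binary.BigEndian.Uint32(data[2:6]))
	iv, body := data[6:22], data[prefixSize:]
	if err := checkParams(algo, key, iv); err != nil {
		return nil, err
	}
	if len(body)%aes.BlockSize != 0 || size > uint64(len(body)) {
		return nil, errors.New("invalid body length")
	}
	block, err := aes.NewCipher(key)
	if err != nil {
		return nil, err
	}
	plain := make([]byte, len(body))
	cipher.NewCBCDecrypter(block, iv).CryptBlocks(plain, body)
	plain = plain[:size] // the stored size removes the padding
	if algo == AlgoAES256Flat {
		return plain, nil
	}
	r, err := zlib.NewReader(bytes.NewReader(plain))
	if err != nil {
		return nil, err
	}
	defer r.Close()
	return io.ReadAll(r)
}

func main() {
	key := make([]byte, 32)           // demo key only
	iv := make([]byte, aes.BlockSize) // demo IV only; use a random IV in practice
	enc, _ := EncryptBlock([]byte(`{"title":"demo"}`), key, iv, AlgoAES256Zip)
	dec, err := DecryptBlock(enc, key)
	fmt.Println(string(dec), err)
}
```

Because the header stores the exact payload size, the decrypting side never needs to know which padding bytes were used; it simply truncates after decryption.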

Index files

Index files have the .index extension.

They are sorted lists of media asset identifiers. They can store all the existing identifiers, or a subset of them, for example, those that have a given tag.

Thanks to being sorted, searching for a specific identifier can be achieved using binary search.

They are binary files, consisting of the following fields:

Starting byte	Size (bytes)	Value name	Description
`0`	`8`	Index size	Number of entries the index file contains, stored as a Big Endian unsigned integer
`8 + 8*K`	`8`	Media asset identifier	Identifier at zero-based position `K`, stored as a Big Endian unsigned integer. Identifiers are stored next to each other, already sorted from lowest to greatest value
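As an illustration of the lookup, the following sketch runs a binary search directly over the serialized index. It works on an in-memory byte slice for brevity; a real implementation would more likely seek within the file. The names `BuildIndex` and `IndexContains` are hypothetical.

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
	"sort"
)

// BuildIndex serializes already-sorted identifiers into the index layout:
// an 8-byte entry count followed by one 8-byte identifier per entry.
func BuildIndex(sortedIDs []uint64) []byte {
	out := make([]byte, 8+8*len(sortedIDs))
	binary.BigEndian.PutUint64(out[:8], uint64(len(sortedIDs)))
	for k, id := range sortedIDs {
		binary.BigEndian.PutUint64(out[8+8*k:], id)
	}
	return out
}

// IndexContains binary-searches the serialized index for id.
func IndexContains(index []byte, id uint64) (bool, error) {
	if len(index) < 8 {
		return false, errors.New("index too short")
	}
	count := binary.BigEndian.Uint64(index[:8])
	if count > uint64(len(index)-8)/8 {
		return false, errors.New("index truncated")
	}
	n := int(count)
	// entry reads the identifier stored at position k (byte offset 8 + 8*k).
	entry := func(k int) uint64 {
		return binary.BigEndian.Uint64(index[8+8*k:])
	}
	// sort.Search returns the first position whose identifier is >= id.
	pos := sort.Search(n, func(k int) bool { return entry(k) >= id })
	return pos < n && entry(pos) == id, nil
}

func main() {
	index := BuildIndex([]uint64{3, 7, 42})
	found, err := IndexContains(index, 7)
	fmt.Println(found, err)
}
```

Since every entry has a fixed size of 8 bytes, position `K` can be read without scanning the preceding entries, which is what makes the binary search possible.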

Encrypted assets

Encrypted assets have the .pma extension.

They store one or multiple encrypted files.

They are also binary files, and they can be of two types:

Single-File encrypted assets: They store a single file in size-limited chunks. Their name usually starts with s_.
Multi-File encrypted assets: They store multiple smaller files. Their name usually starts with m_.

Single-File encrypted assets

These asset files are used to store a single, possibly large, file in chunks, with each chunk encrypted using the method described in the Encrypted JSON files section.

They are binary files consisting of 3 contiguous sections: The header, the chunk index and the encrypted chunks.

The header contains the following fields:

Starting byte	Size (bytes)	Value name	Description
`0`	`8`	File size	Size of the original file, in bytes, stored as a Big Endian unsigned integer
`8`	`8`	Chunk size limit	Max size of a chunk, in bytes, stored as a Big Endian unsigned integer

After the header, the chunk index is stored. For each chunk the file was split into, the chunk index will store a metadata entry, with the following fields:

Starting byte	Size (bytes)	Value name	Description
`0`	`8`	Chunk pointer	Starting byte of the chunk, stored as a Big Endian unsigned integer
`8`	`8`	Chunk size	Size of the chunk, in bytes, stored as a Big Endian unsigned integer

After the chunk index, the encrypted chunks are stored following the same structure described in the Encrypted JSON files section.

This chunked structure allows random access to any point in the file at a low cost, since only the corresponding chunks need to be decrypted, not the entire file. This capability is especially useful for video rewinding and seeking.
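To make the seeking arithmetic concrete, the sketch below maps a byte offset in the original file to the chunk that holds it. It assumes that every chunk except the last holds exactly "chunk size limit" bytes of the original file, which this document does not state explicitly; the type and function names are hypothetical.

```go
package main

import (
	"errors"
	"fmt"
)

const (
	headerSize     = 16 // file size (8 bytes) + chunk size limit (8 bytes)
	indexEntrySize = 16 // chunk pointer (8 bytes) + chunk size (8 bytes)
)

// ChunkLocation describes where to find a byte of the original file.
type ChunkLocation struct {
	Chunk         uint64 // zero-based chunk number
	EntryOffset   uint64 // byte offset of the chunk's entry in the chunk index
	OffsetInChunk uint64 // position of the byte inside the decrypted chunk
}

// LocateChunk finds the chunk containing the given offset of the original file.
// ASSUMPTION: all chunks but the last hold exactly chunkSizeLimit bytes.
func LocateChunk(fileSize, chunkSizeLimit, offset uint64) (ChunkLocation, error) {
	if chunkSizeLimit == 0 {
		return ChunkLocation{}, errors.New("chunk size limit must be positive")
	}
	if offset >= fileSize {
		return ChunkLocation{}, errors.New("offset beyond end of file")
	}
	chunk := offset / chunkSizeLimit
	return ChunkLocation{
		Chunk:         chunk,
		EntryOffset:   headerSize + indexEntrySize*chunk,
		OffsetInChunk: offset % chunkSizeLimit,
	}, nil
}

func main() {
	loc, err := LocateChunk(2500, 1000, 2100)
	fmt.Printf("%+v %v\n", loc, err)
}
```

A player seeking to that offset would read the 16-byte index entry at `EntryOffset`, follow its chunk pointer, decrypt only that chunk, and skip `OffsetInChunk` bytes of the result.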

Multi-File encrypted assets

These asset files are used to store multiple smaller files, meant to be sorted and accessed by an index number.

They are binary files consisting of 3 contiguous sections: The header, the file table and the encrypted files.

The header contains the following fields:

Starting byte	Size (bytes)	Value name	Description
`0`	`8`	File count	Number of files stored by the asset, stored as a Big Endian unsigned integer

After the header, a file table is stored. For each file stored by the asset, a metadata entry is stored, with the following fields:

Starting byte	Size (bytes)	Value name	Description
`0`	`8`	File data pointer	Starting byte of the file encrypted data, stored as a Big Endian unsigned integer
`8`	`8`	File size	Size of the encrypted file, in bytes, stored as a Big Endian unsigned integer

After the file table, each file is stored following the same structure described in the Encrypted JSON files section.

This format is useful for storing video previews without needing too many files.
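Looking up a file by its index number only requires reading one row of the file table, as the sketch below shows. It assumes the file data pointer is an absolute offset from the start of the asset file; `FileEntry` and `ReadFileEntry` are hypothetical names.

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
)

// FileEntry is one row of the file table.
type FileEntry struct {
	Pointer uint64 // starting byte of the encrypted file data
	Size    uint64 // size of the encrypted file, in bytes
}

// ReadFileEntry returns the file table entry for the file with index i.
// ASSUMPTION: pointers are absolute offsets from the start of the asset.
func ReadFileEntry(asset []byte, i uint64) (FileEntry, error) {
	if len(asset) < 8 {
		return FileEntry{}, errors.New("asset too short")
	}
	count := binary.BigEndian.Uint64(asset[:8])
	if count > uint64(len(asset)-8)/16 {
		return FileEntry{}, errors.New("file table truncated")
	}
	if i >= count {
		return FileEntry{}, fmt.Errorf("file index %d out of range (count %d)", i, count)
	}
	off := 8 + 16*i // header (8 bytes) + i table rows of 16 bytes
	e := FileEntry{
		Pointer: binary.BigEndian.Uint64(asset[off:]),
		Size:    binary.BigEndian.Uint64(asset[off+8:]),
	}
	// Reject entries that point outside the asset.
	if e.Pointer > uint64(len(asset)) || e.Size > uint64(len(asset))-e.Pointer {
		return FileEntry{}, errors.New("entry points outside the asset")
	}
	return e, nil
}

func main() {
	asset := make([]byte, 8+16+5)
	binary.BigEndian.PutUint64(asset[0:], 1)  // file count
	binary.BigEndian.PutUint64(asset[8:], 24) // file data pointer
	binary.BigEndian.PutUint64(asset[16:], 5) // encrypted file size
	entry, err := ReadFileEntry(asset, 0)
	fmt.Println(entry, err)
}
```

The bytes in `asset[entry.Pointer : entry.Pointer+entry.Size]` would then be decrypted using the structure described in the Encrypted JSON files section.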

Vault folder structure

Media vaults are stored in folders. A vault folder may contain the following files and folders:

Name	Path	Type	Description
Media assets	`media`	Folder	Folder where media assets are stored.
Tag indexes	`tags`	Folder	Folder where tag indexes are stored.
Lock file	`vault.lock`	Lock file	File used to prevent multiple instances of the PersonalMediaVault backend from accessing a vault at the same time. It may not be present if the vault is not being accessed.
Credentials file	`credentials.json`	Unencrypted JSON file	File to store the existing accounts, along with the hashed credentials and the encrypted vault key, protected with the account password.
Media ID tracker	`media_ids.json`	Unencrypted JSON file	File to store the last used media asset ID.
Tasks tracker	`tasks.json`	Unencrypted JSON file	File used to store the last used task ID, along with the list of pending tasks.
Albums	`albums.pmv`	Encrypted JSON file	File used to store the existing albums, including the metadata and the list of media assets included in them.

Media assets folder

The media assets are stored inside the media folder.

To keep any single folder from growing too large, the assets are distributed evenly across 256 sub-folders. The sub-folder name for each media asset is calculated from its identifier, a 64 bit unsigned integer: the name is the identifier modulo 256, written as a 2 character lowercase hex string.

Examples: 00, 01, 02…, fd, fe, ff.

Inside each subfolder, the assets are stored inside their own folders, named by turning their identifier into a decimal string. Examples:

media_id=0 - Stored inside {VAULT_FOLDER}/media/00/0
media_id=15 - Stored inside {VAULT_FOLDER}/media/0f/15

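import (
    "encoding/hex"
    "fmt"
    "path"
)

// GetMediaAssetFolder returns the folder where the given media asset is stored.
func GetMediaAssetFolder(vault_path string, media_id uint64) string {
    // Sub-folder name: the identifier modulo 256, as 2 lowercase hex characters
    subFolderName := hex.EncodeToString([]byte{byte(media_id % 256)})

    return path.Join(vault_path, "media", subFolderName, fmt.Sprint(media_id))
}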

The media asset folder may contain up to 3 types of files:

Media asset metadata file: Named meta.pmv and used to store metadata.
Single-File assets: Named by concatenating the s_ prefix and the asset ID in decimal, with the .pma extension.
Multi-File assets: Named by concatenating the m_ prefix and the asset ID in decimal, with the .pma extension.

Media asset metadata file

Each media asset folder must contain a file named meta.pmv, an encrypted JSON file containing the metadata of the media asset.

The file contains the following fields:

Field name	Type	Description
`id`	Number (64 bit unsigned integer)	Media asset identifier
`type`	Number (8 bit unsigned integer)	Media type. Can be: `1` (Image), `2` (Video / Animation) or `3` (Audio / Sound)
`title`	String	Title
`tags`	Array<Number (64 bit unsigned integer)>	List of tags for the media. Only identifiers are stored
`duration`	Number (Floating point)	Duration of the media in seconds
`width`	Number (32 bit unsigned integer)	Width in pixels
`height`	Number (32 bit unsigned integer)	Height in pixels
`fps`	Number (32 bit unsigned integer)	Frames per second
`upload_time`	Number (64 bit integer)	Upload timestamp (Unix milliseconds format)
`next_asset_id`	Number (64 bit unsigned integer)	Identifier to use for the next asset, when created
`original_ready`	Boolean	True if the original asset exists and is ready
`original_asset`	Number (64 bit unsigned integer)	Asset ID of the original asset. The original asset is Single-File
`original_ext`	String	Extension of the original asset file. Eg: `mp4`
`original_encoded`	Boolean	True if the original asset is encoded
`original_task`	Number (64 bit unsigned integer)	If the original asset is not encoded, the ID of the task assigned to encode it
`thumb_ready`	Boolean	True if the thumbnail asset exists and is ready
`thumb_asset`	Number (64 bit unsigned integer)	Asset ID of the thumbnail asset. The thumbnail asset is Single-File
`previews_ready`	Boolean	True if the video previews asset exists and is ready
`previews_asset`	Number (64 bit unsigned integer)	Asset ID of the video previews asset. The video previews asset is Multi-File
`previews_interval`	Number (Floating point)	Video previews interval in seconds
`previews_task`	Number (64 bit unsigned integer)	If the video previews asset is not ready, the ID of the task assigned to generate it
`force_start_beginning`	Boolean	True to tell the player not to store the current playing time, so the video or audio starts from the beginning every time
`img_notes`	Boolean	True if the image has a notes asset
`img_notes_asset`	Number (64 bit unsigned integer)	Asset ID of the image notes asset. The image notes asset is Single-File
`resolutions`	Array<Resolution>	List of extra resolutions
`subtitles`	Array<Subtitle>	List of subtitles files
`time_splits`	Array<TimeSplit>	List of time splits for videos or audios
`audio_tracks`	Array<AudioTrack>	List of extra audio tracks for videos
`attachments`	Array<Attachment>	List of attachments stored with the media asset
`ext_desc`	Boolean	True only if the media has a description.
`ext_desc_asset`	Number (64 bit unsigned integer)	Id of the asset containing the description.
`related`	Array<Number (64 bit unsigned integer)>	List of IDs of related media assets

The Resolution object has the following fields:

Field name	Type	Description
`width`	Number (32 bit unsigned integer)	Width in pixels
`height`	Number (32 bit unsigned integer)	Height in pixels
`fps`	Number (32 bit unsigned integer)	Frames per second
`ready`	Boolean	True if the asset is ready
`asset`	Number (64 bit unsigned integer)	Asset ID of the asset. The asset is Single-File
`ext`	String	Asset file extension. Example: `mp4`
`task_id`	Number (64 bit unsigned integer)	If the asset is not ready, ID of the task assigned to encode it

The Subtitle object has the following fields:

Field name	Type	Description
`id`	String	Subtitles language identifier. Example: `eng`
`name`	String	Subtitles file name. Example: `English`
`asset`	Number (64 bit unsigned integer)	Asset ID of the asset. The asset is Single-File

The TimeSplit object has the following fields:

Field name	Type	Description
`time`	Number (Floating point)	Time in seconds where the split starts
`name`	String	Name of the time split

The AudioTrack object has the following fields:

Field name	Type	Description
`id`	String	Audio track language identifier. Example: `eng`
`name`	String	Audio track file name. Example: `English`
`asset`	Number (64 bit unsigned integer)	Asset ID of the asset. The asset is Single-File

The Attachment object has the following fields:

Field name	Type	Description
`id`	Number (64 bit unsigned integer)	Unique attachment identifier
`name`	String	Attachment file name
`size`	Number (64 bit unsigned integer)	Attachment file size (in bytes)
`asset`	Number (64 bit unsigned integer)	Asset ID of the asset. The asset is Single-File

The image notes asset is a JSON file, containing an array of ImageNote objects, with the following fields:

Field name	Type	Description
`x`	Number (32 bit integer)	X position (pixels)
`y`	Number (32 bit integer)	Y position (pixels)
`w`	Number (32 bit integer)	Width (pixels)
`h`	Number (32 bit integer)	Height (pixels)
`text`	String	Text to display for the specified area

Tag indexes folder

When a tag is added to the vault, a new index file is created inside the tags folder, with a name made by concatenating the tag_ prefix with the tag identifier encoded in decimal, and the .index extension.

import (
    "fmt"
    "path"
)

// GetTagIndexPath returns the path of the index file for the given tag.
func GetTagIndexPath(vault_path string, tag_id uint64) string {
	return path.Join(vault_path, "tags", "tag_"+fmt.Sprint(tag_id)+".index")
}

Each tag index file contains the list of identifiers of the media assets that have that tag.

Credentials file

The credentials file, named credentials.json, is an unencrypted JSON file used to store the hashed credentials, along with the encrypted vault key.

The JSON file contains the following fields:

Field name	Type	Description
`user`	String	Username of the root account
`pwhash`	String	Password hash. Base 64 encoded
`salt`	String	Hashing salt. Base 64 encoded
`enckey`	String	Encrypted key. Base 64 encoded
`method`	String	Name of the hashing + encryption method used
`tfa`	Boolean	True if two factor authentication is enabled
`tfa_method`	String	If two factor authentication is enabled, the method (eg: `totp:sha1:60:1`)
`tfa_enckey`	String	Encrypted two factor authentication key. Base 64 encoded
`auth_confirmation`	Boolean	True if the authentication confirmation is enabled
`auth_confirmation_method`	String	Authentication confirmation method (`tfa` or `pw`)
`auth_confirmation_period`	Number (32 bit unsigned integer)	Period (seconds) to prevent asking for authentication confirmation multiple consecutive times.
`fingerprint`	String	Vault fingerprint
`accounts`	Array<Account>	Array of additional accounts

Each Account is an object with the following fields:

Field name	Type	Description
`user`	String	Account username
`pwhash`	String	Password hash. Base 64 encoded
`salt`	String	Hashing salt. Base 64 encoded
`enckey`	String	Encrypted key. Base 64 encoded
`method`	String	Name of the hashing + encryption method used
`write`	Boolean	True if the account has permission to modify the vault
`tfa`	Boolean	True if two factor authentication is enabled
`tfa_method`	String	If two factor authentication is enabled, the method (eg: `totp:sha1:60:1`)
`tfa_enckey`	String	Encrypted two factor authentication key. Base 64 encoded
`auth_confirmation`	Boolean	True if the authentication confirmation is enabled
`auth_confirmation_method`	String	Authentication confirmation method (`tfa` or `pw`)
`auth_confirmation_period`	Number (32 bit unsigned integer)	Period (seconds) to prevent asking for authentication confirmation multiple consecutive times.

Currently, the following methods are implemented:

AES256 + SHA256 + SALT16 - Identifier: aes256/sha256/salt16

AES256 + SHA256 + SALT16

This algorithm uses a random salt of 16 bytes (128 bits).

The password hash is calculated by using the SHA256 algorithm 2 times on the binary concatenation of the password (as UTF-8) and the random salt:

import (
    "crypto/sha256"
)

// ComputePasswordHash applies SHA256 twice to password || salt.
func ComputePasswordHash(password string, salt []byte) []byte {
	firstHash := sha256.Sum256(append([]byte(password), salt...))
	secondHash := sha256.Sum256(firstHash[:])
	return secondHash[:]
}
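A login check against the stored `salt` and `pwhash` fields could then be sketched as below. The use of the standard (padded) Base 64 alphabet and of a constant-time comparison are assumptions of this sketch, and `EncodeCredentials` and `VerifyPassword` are hypothetical helper names.

```go
package main

import (
	"crypto/sha256"
	"crypto/subtle"
	"encoding/base64"
	"fmt"
)

// ComputePasswordHash applies SHA256 twice to password || salt.
func ComputePasswordHash(password string, salt []byte) []byte {
	first := sha256.Sum256(append([]byte(password), salt...))
	second := sha256.Sum256(first[:])
	return second[:]
}

// EncodeCredentials returns the salt and password hash as Base 64 strings.
// ASSUMPTION: the standard Base 64 alphabet with padding is used.
func EncodeCredentials(password string, salt []byte) (saltB64, hashB64 string) {
	return base64.StdEncoding.EncodeToString(salt),
		base64.StdEncoding.EncodeToString(ComputePasswordHash(password, salt))
}

// VerifyPassword recomputes the hash and compares it with the stored one.
func VerifyPassword(password, saltB64, hashB64 string) (bool, error) {
	salt, err := base64.StdEncoding.DecodeString(saltB64)
	if err != nil {
		return false, fmt.Errorf("invalid salt: %w", err)
	}
	want, err := base64.StdEncoding.DecodeString(hashB64)
	if err != nil {
		return false, fmt.Errorf("invalid password hash: %w", err)
	}
	got := ComputePasswordHash(password, salt)
	// Constant-time comparison avoids leaking how many bytes matched.
	return subtle.ConstantTimeCompare(got, want) == 1, nil
}

func main() {
	salt := []byte("0123456789abcdef") // demo salt; real salts are 16 random bytes
	saltB64, hashB64 := EncodeCredentials("correct horse", salt)
	ok, err := VerifyPassword("correct horse", saltB64, hashB64)
	fmt.Println(ok, err)
}
```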

The vault key is encrypted with AES256, using the system defined in the Encrypted JSON files section, specifically the AES256_FLAT algorithm.

The encryption key is calculated by hashing with SHA256 the binary concatenation of the password (as UTF-8) and the random salt:

import (
    "crypto/sha256"
)

// ComputeAESEncryptionKey derives the 256 bit key as SHA256(password || salt).
func ComputeAESEncryptionKey(password string, salt []byte) []byte {
	passwordHash := sha256.Sum256(append([]byte(password), salt...))
	return passwordHash[:]
}

Media ID tracker

The media ID tracker file, named media_ids.json, is an unencrypted JSON file used to keep track of the media identifiers already used, which is essential to prevent duplicated identifiers.

The JSON file has just one field:

Field name	Type	Description
`next_id`	Number (64 bit unsigned integer)	Next identifier to use when adding a new media asset

Tasks tracker

The tasks tracker file, named tasks.json, is an unencrypted JSON file used to keep track of the task identifiers already used, in order to prevent duplicates. It also stores the pending tasks, so they can be resumed after a vault restart.

The JSON file contains the following fields:

Field name	Type	Description
`next_id`	Number (64 bit unsigned integer)	Next identifier to use when creating a new task
`pending`	Object (Mapping String -> PendingTask)	Mapping. For each pending task, the required metadata to restart them

The PendingTask objects have the following fields:

Field name	Type	Description
`id`	Number (64 bit unsigned integer)	Task identifier
`media_id`	Number (64 bit unsigned integer)	Media asset ID
`type`	Number (8 bit unsigned integer)	Task type. It can be: `0` (Encode original), `1` (Encode extra resolution) or `2` (Generate video previews)
`first_time_enc`	Boolean	True if this task is the first time the asset is being encoded (was just uploaded)
`resolution`	Object { `width`: Width (px), `height`: Height (px), `fps`: Frames per second }	Resolution for type = `1`
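For software that needs to read this file, the schema could be mirrored with Go structs like the ones below. The sample JSON is hypothetical, as is the assumption that the keys of `pending` are task IDs in decimal; the integer widths chosen for the resolution fields are also a guess, since the table does not specify them.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// TaskResolution is the target resolution, only meaningful for type = 1.
type TaskResolution struct {
	Width  int32 `json:"width"`
	Height int32 `json:"height"`
	Fps    int32 `json:"fps"`
}

// PendingTask holds the metadata required to restart a task.
type PendingTask struct {
	ID           uint64          `json:"id"`
	MediaID      uint64          `json:"media_id"`
	Type         uint8           `json:"type"` // 0, 1 or 2, see the table above
	FirstTimeEnc bool            `json:"first_time_enc"`
	Resolution   *TaskResolution `json:"resolution,omitempty"`
}

// TaskTracker mirrors the top-level object of tasks.json.
type TaskTracker struct {
	NextID  uint64                  `json:"next_id"`
	Pending map[string]*PendingTask `json:"pending"`
}

// ParseTaskTracker decodes the contents of tasks.json.
func ParseTaskTracker(data []byte) (*TaskTracker, error) {
	var tracker TaskTracker
	if err := json.Unmarshal(data, &tracker); err != nil {
		return nil, err
	}
	return &tracker, nil
}

func main() {
	// Hypothetical file contents, for illustration only.
	sample := []byte(`{"next_id":8,"pending":{"7":{"id":7,"media_id":15,` +
		`"type":1,"first_time_enc":false,` +
		`"resolution":{"width":1280,"height":720,"fps":30}}}}`)
	tracker, err := ParseTaskTracker(sample)
	if err != nil {
		panic(err)
	}
	for key, task := range tracker.Pending {
		fmt.Println(key, task.MediaID, task.Type, task.Resolution.Width)
	}
}
```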

Albums file

The albums file, named albums.pmv, is an encrypted JSON file used to store the list of existing albums in the vault.

The file has the following fields:

Field name	Type	Description
`next_id`	Number (64 bit unsigned integer)	Identifier to use for the next album, when creating a new one.
`next_thumb_id`	Number (64 bit unsigned integer)	Identifier to use for the next thumbnail asset.
`albums`	Object { Mapping ID -> Album }	List of albums. For each album it maps its identifier to its metadata

The Album object has the following fields:

Field name	Type	Description
`name`	String	Name of the album
`lm`	Number (64 bit integer)	Last modified timestamp. Unix milliseconds format
`list`	Array<Number (64 bit unsigned integer)>	List of media asset identifiers contained in the album
`thumb`	Number (64 bit unsigned integer)	ID of the thumbnail asset of the album. May be null if no thumbnail is set.
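Once albums.pmv has been decrypted, the resulting JSON could be decoded as sketched below. The sketch assumes the keys of `albums` are album IDs in decimal, which lets `encoding/json` decode them directly into `uint64` map keys; the sample JSON and the helper `AlbumsContaining` are hypothetical.

```go
package main

import (
	"encoding/json"
	"fmt"
	"sort"
)

// Album mirrors the Album object described above.
type Album struct {
	Name         string   `json:"name"`
	LastModified int64    `json:"lm"` // Unix milliseconds
	List         []uint64 `json:"list"`
	Thumb        *uint64  `json:"thumb"` // nil when no thumbnail is set
}

// AlbumsFile mirrors the decrypted contents of albums.pmv.
type AlbumsFile struct {
	NextID      uint64            `json:"next_id"`
	NextThumbID uint64            `json:"next_thumb_id"`
	Albums      map[uint64]*Album `json:"albums"`
}

// ParseAlbums decodes the decrypted JSON plaintext of albums.pmv.
func ParseAlbums(plaintext []byte) (*AlbumsFile, error) {
	var f AlbumsFile
	if err := json.Unmarshal(plaintext, &f); err != nil {
		return nil, err
	}
	return &f, nil
}

// AlbumsContaining returns the sorted IDs of the albums that list mediaID.
func AlbumsContaining(f *AlbumsFile, mediaID uint64) []uint64 {
	var ids []uint64
	for id, album := range f.Albums {
		if album == nil {
			continue
		}
		for _, m := range album.List {
			if m == mediaID {
				ids = append(ids, id)
				break
			}
		}
	}
	sort.Slice(ids, func(a, b int) bool { return ids[a] < ids[b] })
	return ids
}

func main() {
	// Hypothetical decrypted contents, for illustration only.
	sample := []byte(`{"next_id":3,"next_thumb_id":1,"albums":{` +
		`"1":{"name":"Holidays","lm":1700000000000,"list":[15,20],"thumb":0},` +
		`"2":{"name":"Work","lm":1700000000001,"list":[20],"thumb":null}}}`)
	f, err := ParseAlbums(sample)
	if err != nil {
		panic(err)
	}
	fmt.Println(AlbumsContaining(f, 20))
}
```

Using a pointer for `thumb` keeps the distinction between a null thumbnail and a thumbnail whose asset ID happens to be 0.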

Albums thumbnails

The thumbnail files of the albums are stored in a folder, named thumb_album.

This folder contains Single-File encrypted assets, named by concatenating the s_ prefix and the asset ID in decimal, with the .pma extension.

Tags file

The tags file, named tag_list.pmv, is an encrypted JSON file used to store the list of existing tags in the vault.

The file has the following fields:

Field name	Type	Description
`next_id`	Number (64 bit unsigned integer)	Identifier to use for the next tag, when creating a new one.
`tags`	Object { Mapping ID -> String }	List of tags. For each tag, it maps its identifier to its name

User configuration file

The user configuration file, named user_config.pmv, is an encrypted JSON file used to store the vault configuration set by the user.

The file has the following fields:

Field name	Type	Description
`title`	String	Vault custom title
`logo`	String	Vault custom logo text
`css`	String	Custom CSS for the frontend
`max_tasks`	Number (32 bit integer)	Max number of tasks to run in parallel
`encoding_threads`	Number (32 bit integer)	Max number of threads to use for a single encoding task
`video_previews_interval`	Number (32 bit integer)	Video previews interval (seconds)
`resolutions`	Array<VideoResolution>	Resolutions to automatically encode when uploading a video
`image_resolutions`	Array<ImageResolution>	Resolutions to automatically encode when uploading an image
`invite_limit`	Number (32 bit integer)	Max number of invites per user
`preserve_originals`	Boolean	Preserve original media before encoding?

The VideoResolution object has the following fields:

Field name	Type	Description
`width`	Number (32 bit unsigned integer)	Width in pixels
`height`	Number (32 bit unsigned integer)	Height in pixels
`fps`	Number (32 bit unsigned integer)	Frames per second

The ImageResolution object has the following fields:

Field name	Type	Description
`width`	Number (32 bit unsigned integer)	Width in pixels
`height`	Number (32 bit unsigned integer)	Height in pixels

Home page configuration file

The home page configuration file, named home_page.pmv, is an encrypted JSON file used to store the home page configuration for the vault.

The file has the following fields:

Field name	Type	Description
`groups`	Array<HomePageGroup>	Groups of elements to display in the home page.
`next_id`	Number (64 bit unsigned integer)	ID to assign to the next group created by the user.

The HomePageGroup object has the following fields:

Field name	Type	Description
`id`	Number (64 bit unsigned integer)	ID of the group, used to uniquely identify it.
`type`	Number (8 bit unsigned integer)	Type of group (`0` = custom/default, `1` = recent media, `2` = recent albums)
`name`	String	Name for the group, in order to display it to the user.
`elements`	Array<HomePageElement>	List of ordered elements to display for the group. Only for `type` = `0`

The HomePageElement object has the following fields:

Field name	Type	Description
`t`	Number (8 bit unsigned integer)	Type of element (`0` = media/default, `1` = album)
`i`	Number (64 bit unsigned integer)	Identifier of the media or the album

Main index file

The main index file, named main.index, is an index file containing every single media asset identifier existing in the vault.

This file is used to check if a media asset exists and to perform searches when a tag filter is not specified.

Last modified November 22, 2025: Update last release (afa5fe4)