
❌ This issue is not open for contribution. Visit Contributing guidelines to learn about the contributing process and how to find suitable issues.

Overview
This is the FOUNDATION issue for the constants migration project. It establishes the infrastructure and pattern that Issues #2-5 will follow. This issue must be completed before the other migration issues can proceed.
Context
Currently, le_utils/constants/file_formats.py uses the legacy approach:
- Loads resources/formatlookup.json at runtime with pkgutil.get_data()
- Manual Python constants (MP4 = "mp4", PDF = "pdf", etc.) must be kept in sync
- Manual _FORMATLOOKUP dict and getformat() helper function
- No JavaScript export available
- Tests verify Python/JSON sync
This issue migrates it to the modern spec + code generation approach used by 8 other modules.
Scope
This issue will:
- Enhance generate_from_specs.py to support namedtuple-based constants (the key infrastructure work)
- Create spec/constants-file_formats.json following the new format
- Generate Python and JavaScript files via make build
- Update tests to verify against the spec
- Delete resources/formatlookup.json
- Document the spec format for subsequent tasks
Current Structure
File: le_utils/resources/formatlookup.json (only has 20 formats)
{
  "mp4": {"mimetype": "video/mp4"},
  "webm": {"mimetype": "video/webm"},
  "vtt": {"mimetype": ".vtt"},
  "pdf": {"mimetype": "application/pdf"},
  ...
}
Python module (file_formats.py) currently has 40+ manual constants including:
- Formats in JSON: MP4, WEBM, VTT, PDF, EPUB, MP3, JPG, JPEG, PNG, GIF, JSON, SVG, GRAPHIE, PERSEUS, H5P, ZIM, HTML5 (zip), BLOOMPUB, BLOOMD, HTML5_ARTICLE (kpub)
- Formats NOT in JSON (these need to be added to spec): AVI, MOV, MPG, WMV, MKV, FLV, OGV, M4V, SRT, TTML, SAMI, SCC, DFXP
- Namedtuple: class Format(namedtuple("Format", ["id", "mimetype"])): pass
- LIST, choices tuple, helper function getformat()
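The gap between the two format lists can be checked mechanically. The following is a minimal sketch, assuming the legacy JSON keys and the module's constant values are available as plain sets. Both sets here are abbreviated and typed in by hand for illustration, not loaded from le_utils, and missing_from_json is a hypothetical helper name.

```python
# Abbreviated, hand-copied stand-ins for illustration only.
json_keys = {"mp4", "webm", "vtt", "pdf"}  # keys of formatlookup.json
module_constants = {"mp4", "webm", "vtt", "pdf", "avi", "mov", "srt"}  # values of MP4, WEBM, ...


def missing_from_json(constants, keys):
    """Return the constant values that have no entry in the JSON lookup, sorted."""
    return sorted(set(constants) - set(keys))


missing = missing_from_json(module_constants, json_keys)
print(missing)  # ['avi', 'mov', 'srt']
```

Running the same comparison against the real module and JSON file would list every format that has to be added to the spec.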
Target Spec Format
Create spec/constants-file_formats.json with ALL formats including those currently missing from JSON:
{
  "namedtuple": {
    "name": "Format",
    "fields": ["id", "mimetype"]
  },
  "constants": {
    "mp4": {"mimetype": "video/mp4"},
    "webm": {"mimetype": "video/webm"},
    "avi": {"mimetype": "video/x-msvideo"},
    "mov": {"mimetype": "video/quicktime"},
    "mpg": {"mimetype": "video/mpeg"},
    "wmv": {"mimetype": "video/x-ms-wmv"},
    "mkv": {"mimetype": "video/x-matroska"},
    "flv": {"mimetype": "video/x-flv"},
    "ogv": {"mimetype": "video/ogg"},
    "m4v": {"mimetype": "video/x-m4v"},
    "vtt": {"mimetype": "text/vtt"},
    "srt": {"mimetype": "application/x-subrip"},
    "ttml": {"mimetype": "application/ttml+xml"},
    "sami": {"mimetype": "application/x-sami"},
    "scc": {"mimetype": "text/x-scc"},
    "dfxp": {"mimetype": "application/ttaf+xml"},
    "mp3": {"mimetype": "audio/mpeg"},
    "pdf": {"mimetype": "application/pdf"},
    "epub": {"mimetype": "application/epub+zip"},
    "jpg": {"mimetype": "image/jpeg"},
    "jpeg": {"mimetype": "image/jpeg"},
    "png": {"mimetype": "image/png"},
    "gif": {"mimetype": "image/gif"},
    "json": {"mimetype": "application/json"},
    "svg": {"mimetype": "image/svg+xml"},
    "graphie": {"mimetype": "application/graphie"},
    "perseus": {"mimetype": "application/perseus+zip"},
    "h5p": {"mimetype": "application/h5p+zip"},
    "zim": {"mimetype": "application/zim"},
    "zip": {"mimetype": "application/zip"},
    "bloompub": {"mimetype": "application/bloompub+zip"},
    "bloomd": {"mimetype": "application/bloompub+zip"},
    "kpub": {"mimetype": "application/kpub+zip"}
  }
}
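A spec in this shape can be sanity-checked before generation. The sketch below assumes that the first namedtuple field ("id") comes from the constant's key and that every remaining field must appear in each constant's value. validate_namedtuple_spec is a hypothetical helper, not part of generate_from_specs.py, and the spec text is a two-entry excerpt.

```python
import json

# Two-entry excerpt of the target spec, inlined so the example is self-contained.
SPEC_TEXT = """
{
  "namedtuple": {"name": "Format", "fields": ["id", "mimetype"]},
  "constants": {
    "mp4": {"mimetype": "video/mp4"},
    "pdf": {"mimetype": "application/pdf"}
  }
}
"""


def validate_namedtuple_spec(spec):
    """Return a list of problems found in a namedtuple-style spec (empty if none)."""
    problems = []
    fields = spec["namedtuple"]["fields"]
    # Assumption: the first field is the id and is taken from the key, not the value.
    value_fields = set(fields[1:])
    for key, value in spec["constants"].items():
        if set(value) != value_fields:
            problems.append(
                "{}: expected fields {}, got {}".format(
                    key, sorted(value_fields), sorted(value)
                )
            )
    return problems


spec = json.loads(SPEC_TEXT)
problems = validate_namedtuple_spec(spec)
print(problems)  # []
```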
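How to determine mimetypes for missing formats:
- For formats without a standard mimetype, use the application/{format} or application/{format}+zip pattern (as the spec above does for graphie and perseus)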
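One way to draft mimetypes for the missing formats is to ask Python's standard mimetypes module first and fall back to an application/{format} or application/{format}+zip pattern for custom formats. This is only a sketch: suggest_mimetype is a hypothetical helper, the stdlib's answers depend on the Python version and the system's mime database, and they may not match the values chosen in the spec (for example for subtitle formats), so each suggestion still needs review.

```python
import mimetypes


def suggest_mimetype(extension, zipped=False):
    """Suggest a mimetype for a file extension (hypothetical helper, review the result)."""
    guessed, _encoding = mimetypes.guess_type("file." + extension)
    if guessed is not None:
        return guessed
    # Fall back to the custom pattern for formats the stdlib does not know.
    suffix = "+zip" if zipped else ""
    return "application/{}{}".format(extension, suffix)


print(suggest_mimetype("pdf"))
print(suggest_mimetype("perseus", zipped=True))
```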
Generation Script Enhancement
Update scripts/generate_from_specs.py to handle the namedtuple format:
- Modify read_constants_specs() to detect and handle namedtuple format:
  - Check if spec has namedtuple key
  - If yes, extract namedtuple definition and constants
  - If no, use existing simple constant handling
- Update write_python_file() to support namedtuples:
  - Add from collections import namedtuple import when needed
  - Generate namedtuple class definition
  - Generate {MODULE}LIST with namedtuple instances
  - Generate uppercase constants from keys (e.g., MP4 = "mp4")
  - Generate _MIMETYPE constants (e.g., MP4_MIMETYPE = "video/mp4") for each format
  - Generate choices tuple with custom display names (from spec or title-cased)
  - Generate lookup dict: _{MODULE}LOOKUP = {item.id: item for item in {MODULE}LIST}
  - Generate helper function (e.g., getformat())
- Update write_js_file() to export rich namedtuple data with PascalCase:
  - Export constant name → id mapping (default export, e.g., MP4: "mp4")
  - Export FormatsList - full namedtuple data as array
  - Export FormatsMap - Map for efficient lookups
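The detection and Python generation steps above can be sketched as follows. This is not the actual code of generate_from_specs.py, whose function signatures are not shown in this issue: split_spec and render_namedtuple_module are hypothetical names, the sketch assumes the first namedtuple field is the id taken from the spec key, it generalizes the _MIMETYPE constants to one constant per extra field, and it leaves out the choices tuple and helper function for brevity.

```python
import json


def split_spec(spec):
    """Return (namedtuple_def, constants).

    namedtuple_def is None when the spec uses the existing simple format.
    """
    if isinstance(spec, dict) and "namedtuple" in spec:
        return spec["namedtuple"], spec["constants"]
    return None, spec


def render_namedtuple_module(prefix, namedtuple_def, constants):
    """Render Python source for a namedtuple-based constants module."""
    name = namedtuple_def["name"]
    fields = namedtuple_def["fields"]
    id_field, extra_fields = fields[0], fields[1:]  # assumes the first field is the id
    lines = [
        "from collections import namedtuple",
        "",
        "class {0}(namedtuple({1}, {2})):".format(name, json.dumps(name), json.dumps(fields)),
        "    pass",
        "",
    ]
    # Uppercase constants from keys, e.g. MP4 = "mp4"
    for key in constants:
        lines.append("{} = {}".format(key.upper(), json.dumps(key)))
    # One constant per extra field, e.g. MP4_MIMETYPE = "video/mp4"
    for key, value in constants.items():
        for field in extra_fields:
            lines.append(
                "{}_{} = {}".format(key.upper(), field.upper(), json.dumps(value[field]))
            )
    # {MODULE}LIST with namedtuple instances
    lines.append("{}LIST = [".format(prefix))
    for key, value in constants.items():
        kwargs = ["{}={}".format(id_field, json.dumps(key))]
        kwargs += ["{}={}".format(f, json.dumps(value[f])) for f in extra_fields]
        lines.append("    {}({}),".format(name, ", ".join(kwargs)))
    lines.append("]")
    # Lookup dict keyed by id
    lines.append(
        "_{0}LOOKUP = {{item.{1}: item for item in {0}LIST}}".format(prefix, id_field)
    )
    return "\n".join(lines) + "\n"


spec = {
    "namedtuple": {"name": "Format", "fields": ["id", "mimetype"]},
    "constants": {
        "mp4": {"mimetype": "video/mp4"},
        "pdf": {"mimetype": "application/pdf"},
    },
}
namedtuple_def, constants = split_spec(spec)
source = render_namedtuple_module("FORMAT", namedtuple_def, constants)
namespace = {}
exec(source, namespace)  # sanity check: the rendered source is valid Python
print(source)
```

Executing the rendered source in a scratch namespace is a cheap way to catch quoting or indentation mistakes in the generator before the file is written to disk.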
Generated Output Example
Python (le_utils/constants/file_formats.py):
# -*- coding: utf-8 -*-
# Generated by scripts/generate_from_specs.py
from __future__ import unicode_literals
from collections import namedtuple
# FileFormats
class Format(namedtuple("Format", ["id", "mimetype"])):
    pass
# Format constants
MP4 = "mp4"
WEBM = "webm"
AVI = "avi"
PDF = "pdf"
# ... (all formats)
# Mimetype constants
MP4_MIMETYPE = "video/mp4"
WEBM_MIMETYPE = "video/webm"
AVI_MIMETYPE = "video/x-msvideo"
PDF_MIMETYPE = "application/pdf"
# ...
choices = (
    (MP4, "Mp4"),
    (WEBM, "Webm"),
    (AVI, "Avi"),
    (PDF, "Pdf"),
    # ...
)
FORMATLIST = [
    Format(id="mp4", mimetype="video/mp4"),
    Format(id="webm", mimetype="video/webm"),
    Format(id="avi", mimetype="video/x-msvideo"),
    # ...
]
_FORMATLOOKUP = {f.id: f for f in FORMATLIST}
def getformat(id, default=None):
    """
    Try to lookup a file format object for its `id` in internal representation.
    Returns `default` (None unless given) if lookup by internal representation fails.
    """
    return _FORMATLOOKUP.get(id, default)
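Because FORMATLIST carries full records rather than bare ids, callers can derive other views from it. As a small illustration, the sketch below rebuilds a three-entry stand-in for the generated list (not the real module) and groups ids by mimetype, which shows that jpg and jpeg share one mimetype in the spec.

```python
from collections import defaultdict, namedtuple

# Stand-in for the generated module, trimmed to three entries.
Format = namedtuple("Format", ["id", "mimetype"])
FORMATLIST = [
    Format(id="jpg", mimetype="image/jpeg"),
    Format(id="jpeg", mimetype="image/jpeg"),
    Format(id="pdf", mimetype="application/pdf"),
]

# Group format ids by their mimetype.
ids_by_mimetype = defaultdict(list)
for fmt in FORMATLIST:
    ids_by_mimetype[fmt.mimetype].append(fmt.id)

print(ids_by_mimetype["image/jpeg"])  # ['jpg', 'jpeg']
```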
JavaScript (js/FileFormats.js):
// Generated by scripts/generate_from_specs.py
// Format constants
export default {
  MP4: "mp4",
  WEBM: "webm",
  AVI: "avi",
  PDF: "pdf",
  // ...
};
// Full format data with mimetypes
export const FormatsList = [
  { id: "mp4", mimetype: "video/mp4" },
  { id: "webm", mimetype: "video/webm" },
  { id: "avi", mimetype: "video/x-msvideo" },
  { id: "pdf", mimetype: "application/pdf" },
  // ...
];
// Lookup Map
export const FormatsMap = new Map(
  FormatsList.map(format => [format.id, format])
);
This way JavaScript code can:
- Use constants: import FileFormats from './FileFormats'; if (ext === FileFormats.MP4) ...
- Access full data: import { FormatsList } from './FileFormats';
- Look up by id: import { FormatsMap } from './FileFormats'; const format = FormatsMap.get('pdf');
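On the generator side, the JavaScript exports described here could be produced from the same constants mapping. The following is a sketch only: render_js is a hypothetical name, not the real write_js_file(), and it emits records with quoted keys via json.dumps, which is valid JavaScript but differs cosmetically from the unquoted keys in the example output above.

```python
import json


def render_js(constants, list_name="FormatsList", map_name="FormatsMap"):
    """Render JavaScript source with a default export, a record list and a Map."""
    lines = ["// Generated by scripts/generate_from_specs.py", "export default {"]
    # Constant name -> id mapping, e.g. MP4: "mp4"
    for key in constants:
        lines.append("  {}: {},".format(key.upper(), json.dumps(key)))
    lines.append("};")
    lines.append("")
    # Full records, one object per constant, with the id placed first
    lines.append("export const {} = [".format(list_name))
    for key, value in constants.items():
        record = dict(id=key, **value)
        lines.append("  {},".format(json.dumps(record)))
    lines.append("];")
    lines.append("")
    # Map keyed by id for lookups
    lines.append("export const {} = new Map(".format(map_name))
    lines.append("  {}.map(item => [item.id, item])".format(list_name))
    lines.append(");")
    return "\n".join(lines) + "\n"


js_source = render_js(
    {
        "mp4": {"mimetype": "video/mp4"},
        "pdf": {"mimetype": "application/pdf"},
    }
)
print(js_source)
```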
Testing Updates
File: tests/test_formats.py
Update to test against spec instead of old JSON:
import os
import json
spec_path = os.path.join(os.path.dirname(__file__), "..", "spec", "constants-file_formats.json")
with open(spec_path) as f:
    spec = json.load(f)
formatlookup = spec["constants"]
# Verify all constants in Python module match spec
# Verify FORMATLIST namedtuples match spec data
# Test getformat() helper
# Verify _MIMETYPE constants
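The four checks listed in those comments could be written along these lines. This is a sketch under assumptions: check_module_against_spec is a hypothetical helper, it assumes each constant is named after its uppercased key, and it runs against a SimpleNamespace stand-in so that the example works without le_utils installed. In tests/test_formats.py the real file_formats module and the loaded spec would be passed instead.

```python
from collections import namedtuple
from types import SimpleNamespace


def check_module_against_spec(module, spec):
    """Return a list of mismatches between a constants module and its spec."""
    errors = []
    constants = spec["constants"]
    for key, value in constants.items():
        name = key.upper()  # assumption: constants are named after the uppercased key
        if getattr(module, name, None) != key:
            errors.append("{} != {!r}".format(name, key))
        if getattr(module, name + "_MIMETYPE", None) != value["mimetype"]:
            errors.append("{}_MIMETYPE mismatch".format(name))
        fmt = module.getformat(key)
        if fmt is None or fmt.mimetype != value["mimetype"]:
            errors.append("getformat({!r}) mismatch".format(key))
    if {f.id for f in module.FORMATLIST} != set(constants):
        errors.append("FORMATLIST ids differ from spec keys")
    return errors


# Stand-in for the generated module, for illustration only.
Format = namedtuple("Format", ["id", "mimetype"])
formats = [Format("mp4", "video/mp4")]
lookup = {f.id: f for f in formats}
fake_module = SimpleNamespace(
    MP4="mp4",
    MP4_MIMETYPE="video/mp4",
    FORMATLIST=formats,
    getformat=lookup.get,
)
spec = {"constants": {"mp4": {"mimetype": "video/mp4"}}}

errors = check_module_against_spec(fake_module, spec)
print(errors)  # []
```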
How to Run Tests
# Run file formats tests
pytest tests/test_formats.py -v
# Run all tests to ensure nothing broke
pytest tests/ -v
Acceptance Criteria
- scripts/generate_from_specs.py enhanced to support namedtuple specs
- spec/constants-file_formats.json created with ALL formats (including AVI, MOV, SRT, etc. currently missing)
- make build successfully generates Python and JavaScript files
- le_utils/constants/file_formats.py has:
  - _MIMETYPE constants for each format
  - choices tuple
  - FORMATLIST with namedtuple instances
  - _FORMATLOOKUP dict
  - getformat() helper function
- js/FileFormats.js has:
  - FormatsList export (PascalCase) with full data
  - FormatsMap export (PascalCase) as Map
- tests/test_formats.py updated to test against spec
- pytest tests/ -v passes
- resources/formatlookup.json deleted
Disclosure
🤖 This issue was written by Claude Code, under supervision, review and final edits by @rtibbles 🤖